


DOCUMENT RESUME

ED 174 668                                             TM 009 573

TITLE         Proceedings of the Invitational Conference on Testing
              Problems (New York, New York, October 29, 1955).
INSTITUTION   Educational Testing Service, Princeton, N.J.
PUB DATE      29 Oct 55
NOTE          148p.
EDRS PRICE    MF01/PC06 Plus Postage.
DESCRIPTORS   Adults; *Aptitude Tests; Educational Counseling;
              Employment Services; Government Employees;
              *Occupational Guidance; Occupational Tests;
              *Predictive Measurement; Psychological Testing;
              Psychologists; *Testing Problems; *Test
              Interpretation; Test Results; Vocational Aptitude;
              Vocational Counseling

ABSTRACT

The conference focused upon the users of tests in
counseling and guidance. The first session centered on multi-factor
ability test batteries, with papers on Use of Multi-Factor Aptitude
Tests in School Counseling, by Robert D. North; Use of the General
Aptitude Test Battery in the Employment Service, by Pauline K.
Anderson; Service Tests of Multiple Aptitudes, by Edward E. Cureton;
and Logic of and Assumptions Underlying Differential Testing, by John
W. French. Papers in the second session considered methods of
improving communication of test information. Particular attention was
given to the responsibility of the test user for initiating and
maintaining communication with the test author and publisher. Papers
were given by John W. Gustad on Helping Students Understand Test
Information; Alexander G. Wesman on the Obligations of the Test User;
and David H. Dingilian on How Basic Organization Influences Testing.
The luncheon address was a re-examination of the role of the
psychologist in modern society, presented by Morris S. Viteles. The
final session reviewed the relative merits of clinical and actuarial
approaches to prediction. Participants in the panel were Nevitt
Sanford, Charles C. McArthur, Joseph Zubin, Lloyd G. Humphreys, and
Paul E. Meehl. (BK)



* Reproductions supplied by EDRS are the best that can be made *
* from the original document.                                  *






BOARD OF TRUSTEES, 1955-1956



Lewis Jones, Chairman 



Arthur S. Adams 
Samuel T. Arnold 
Frank H. Bowles 
Charles W. Cole 
Donald K. David 
John W. Gardner 
Henry H. Hill



George D. Stoddard 
Benjamin C. Willis 



Wallace Macgregor 
Katharine E. McBride 
William G. Saltonstall 



Frederick L. Hovde 
Clark Kerr 



OFFICERS 

Henry Chauncey, President 
Richard H. Sullivan, Vice President and Treasurer 
William W. Turnbull, Vice President
Henry S. Dyer, Vice President
Jack K. Rimalover, Secretary
Catherine G. Sharp, Assistant Secretary
Robert F. Kolkebeck, Assistant Treasurer



COPYRIGHT, 1956, EDUCATIONAL TESTING SERVICE
20 NASSAU STREET, PRINCETON, N. J.
PRINTED IN THE UNITED STATES OF AMERICA
Library of Congress Catalog Number: 47-11220



INVITATIONAL CONFERENCE

ON 

TESTING PROBLEMS 

OCTOBER 29, 1955 

RALPH F. BERDIE, Chairman

Multi-Factor Ability Test Batteries in 
Counseling and Guidance 

Communication of Test Information 

Clinical vs. Actuarial Prediction 



EDUCATIONAL TESTING SERVICE 

PRINCETON, NEW JERSEY          LOS ANGELES, CALIFORNIA



FOREWORD 



The 1955 Invitational Conference on Testing Problems was concerned 
with a number of significant and timely topics. Participants, representing
diversified backgrounds and opinions in testing, psychology, and related 
fields, considered the problems involved in the use of multi-factor test 
batteries in counseling and guidance, methods of improving communica- 
tion of test information, and the relative merits of clinical and actuarial 
approaches to prediction. Professor Viteles' luncheon address was stimu- 
lating in re-examining the role of the psychologist in our society. 

This published record of the proceedings of the 1955 Conference will, 
I hope, convey to an even greater audience the many new ideas and 
developments reported and discussed at this conference. 

The Chairman of the 1955 Conference, Ralph F. Berdie, met well the 
challenge of creating an interacting, informative, and successful program. 
To him and to the speakers I would like to offer our sincere thanks for 
a job well done. 



Henry Chauncey 
President 



PREFACE 



Users of tests in counseling and guidance make many educational, 
psychological, and psychometric assumptions, frequently without being 
completely aware of the nature of these assumptions. The primary pur- 
pose of the 1955 Invitational Conference on Testing Problems, sponsored 
by Educational Testing Service, was to explore and clarify some im- 
portant principles underlying the counselors' and clinicians' use of tests. 

Unlike conferences of immediately preceding years, during which dif- 
ferent sessions had been presented simultaneously, this year four con- 
secutive sessions provided an opportunity for the 500 persons attending 
the Conference to listen to the papers reproduced in this report. 

Problems of theoretical and practical importance to users of differen- 
tial aptitude tests were discussed during the first session. The papers 
presented descriptions of uses of these tests, descriptions and evaluations 
of the available tests, and an analysis of some of the assumptions 
underlying the development and use of differential tests. 

Problems of communication among persons concerned with testing 
were discussed in the second session. Particular attention was given to 
the responsibility of the test user for initiating and maintaining com- 
munication with the test author and publisher. Some refreshing and new 
points of view were presented from the orientation of an educational 
administrator. 

Participants in the Conference were particularly fortunate to have 
Dr. Morris Viteles read a paper which analyzed the responsibilities of the 
psychologist in modern society. The important issues discussed by Dr. 
Viteles are of interest not only to psychologists, but to all persons whose 
daily activities bring them into casual or continuing contact with 
psychologists. 

A topic which has received much unsystematic and often heated dis- 
cussion in the past was reviewed with care by the participants in the 
afternoon session, the question of the relative efficiency of the clinician 
and the statistician as predictors. Of particular significance is the paper 
by Dr. Meehl, which unfortunately because of lack of time, he was 
unable to read at the Conference. 

The success of the Conference was due entirely to the careful work 
and preparation done by the persons participating on the program and 
by the very efficient arrangements made by Mr. Jack K. Rimalover and 
Mrs. Catherine G. Sharp. The chairman also wishes to express his
appreciation to the many members of the staff of Educational Testing
Service who made this program possible, particularly to Dr. Henry S. Dyer.

RALPH F. BERDIE 

Chairman 



CONTENTS 



FOREWORD by Henry Chauncey 3 

PREFACE by Ralph F. Berdie 5 

GENERAL MEETING 

"Multi-Factor Ability Test Batteries in Counseling and Guidance"

The Use of Multi-Factor Aptitude Tests in School 
Counseling 

Robert D. North, Educational Records Bureau 11 

The Use of the General Aptitude Test Battery in the
Employment Service
Pauline K. Anderson, New York State Employment Service 16

Service Tests of Multiple Aptitudes

Edward E. Cureton, University of Tennessee 22

The Logic of and Assumptions Underlying Differential 
Testing 

John W. French, Educational Testing Service 40 

GENERAL MEETING 

"Communication of Test Information"

Helping Students Understand Test Information 

John W. Gustad, University of Maryland 51 

The Obligations of the Test User 

Alexander G. Wesman, The Psychological Corporation 60 

How Basic Organization Influences Testing 

David H. Dingilian, Harbor Junior College, Los Angeles 
City Schools 66 

LUNCHEON ADDRESS: "The Psychologist and Society"

Morris S. Viteles, Professor of Psychology, University of
Pennsylvania 78




PANEL DISCUSSION 

"Clinical vs. Actuarial Prediction"

Nevitt Sanford, Vassar College 93

Charles C. McArthur, Harvard University 99

Joseph Zubin, New York State Psychiatric Institute 107

Lloyd G. Humphreys, Personnel Research Laboratory, Lackland
Air Force Base 129

Paul E. Meehl, University of Minnesota 136 

Appendix 143 






GENERAL MEETING 

Multi-Factor Ability Test Batteries in 
Counseling and Guidance 






The Use of Multi-Factor Aptitude Tests
in School Counseling 



ROBERT D. NORTH 



Although teachers and administrators in many school systems still 
rely heavily on general intelligence tests for evaluating academic apti- 
tude, trained counselors are turning in increasing numbers to the use 
of multi-factor aptitude tests for individual guidance purposes. They 
look to these tests for the differential measurement of the various
aptitudes that are related to academic and vocational success.

Among the multi-factor tests that are currently available for school 
use are the Chicago Tests of Primary Mental Abilities, the SRA Primary 
Mental Ability Tests, the Guilford-Zimmerman Aptitude Survey, and 
the Holzinger-Crowder Uni-Factor Tests. The General Aptitude Battery 
of the United States Employment Service is offered for school use with 
high school seniors in some states. This list may be augmented by
including multiple aptitude batteries that are based in part upon factor
analysis research, or that yield some measures which are essentially in
the factor domain, such as the Yale Educational Aptitude Tests, the
California Test of Mental Maturity, the Differential Aptitude Tests,
and the California Multiple Aptitude Tests. As a group, these batteries 
provide a coverage of a wide range of factors, extending from those that 
lie mainly in the area of intelligence to those that represent special 
aptitudes of vocational nature. 

The approach offered by factor analysis, that of measuring human
abilities in terms of well-defined primary dimensions, is certainly
appealing to the school counselor. However, the practical usefulness of
factor scores, as of any other test scores, definitely depends upon their 
reliability, validity, and normative interpretability. In addition, dif- 
ferential prediction requires that differences among aptitude scores be 
reliable. Since I presume that more technical discussions of these prob- 
lems will be presented by some of the other speakers this morning, I 
shall comment only briefly on these topics. 

In regard to the reliability of multi-factor tests, test authors find that 
they have to strike a delicate balance between maintaining optimal 
reliability standards and meeting the practitioner's demand for test 
batteries that can be given within reasonable time limits. For example,
the prototype of the current multi-factor tests, the Chicago Tests of
Primary Mental Abilities (separate booklet edition), requires about




four hours of administration time for the total battery at the intermediate 
age level. This administration time is reduced to approximately forty- 
five minutes, though, in the SRA edition of the Primary Mental Ability 
Tests at the corresponding age level— evidently to meet the requirements 
of test users for a shorter battery. While the reliability data given in the 
Thurstone PMA manuals are not sufficiently adequate to permit a 
direct comparison to be made of the reliabilities of these two editions,
evidence cited by Anastasi (1) indicates that some of the tests in the
abridged edition are too low in reliability for satisfactory use in
intra-individual measurement.

In order to fit multi-factor tests within practical time limits without
sacrificing the needed degree of reliability of measurement, the counselor
may find it necessary to restrict measurement to certain selected factors.
In the case of most multi-factor batteries, the component tests may be
given separately if time limits do not permit the administration of the
entire battery. A new measure, the Holzinger-Crowder Uni-Factor
Tests, provides for the evaluation of just the verbal, spatial, numerical,
and reasoning factors, which are generally found to be more closely
related than other factors to academic achievement. Two periods of
approximately forty-five minutes each are required for administering
this test.

Another approach would be to use a short multi-factor battery for an
initial appraisal, and then to supplement this with more intensive
measures of certain factors. In other words, it might be desirable to have a
multi-factor survey test that would be coordinated with a series of
diagnostic factor tests, just as we have survey and diagnostic tests in
the reading area. Perhaps multi-factor aptitude test batteries of this
type may be published in the future.

It is difficult to make any brief generalizations about the validities
of multi-factor tests in connection with their uses in school counseling.
We might note, though, that the authors of these tests are tending to
give more attention to concurrent and predictive validity, rather than
resting their cases entirely on content and construct validity. For
instance, a considerable amount of academic validity data is to be found
in the comprehensive manual for the Differential Aptitude Tests. It is
encouraging, too, to find that some evidence of concurrent and
predictive validity for the new Holzinger-Crowder test and the California
Multiple Aptitude Tests was gathered in advance of the release of these
tests for general use.

As the multi-factor aptitude tests become more widely used, the 
adequacy of the norms will probably be improved. Where only the 
relative ranking of a student on the various factors is desired, the 






norms for most of the current multi-factor tests are reasonably
satisfactory. But for more precise interpretations of the factor scores, more
attention must be given to the representativeness of the norm groups.

For example, one of the important functions served by aptitude tests
in the school situation is that of providing an objective basis for
evaluating a student's academic achievement in terms of his learning ability.
For this type of comparison it is desirable that both the achievement
and aptitude test scores be interpretable in terms of the same or very
similar norm groups. However, such a condition is not likely to be met
in the national norms, except where the multi-factor aptitude tests and
achievement tests are prepared by the same publisher, and probably
not even then. Establishing local norms is not an entirely satisfactory
solution, since some of the advantages of national standardization are
thereby forfeited. The large-scale statewide and independent school
testing programs have been successful in providing comparable norms
on some of the aptitude and achievement tests, but there are many
schools that do not fall within the scope of these programs, and hence
the norms are not applicable to them. This problem is not unique to
multi-factor tests, of course. Let us hope that some solution may be
found short of basing norms on Toops' "standard million."

The question of the magnitude of the differences that must be
registered among factor scores before such differences may be used as a basis
for differential prediction is one with which the school counselor needs
considerable assistance from the test technicians. Perhaps the test
authors and publishers may develop some improved techniques for
helping the test users to understand the necessity for making allowances
for the standard error of score differences in the interpretation of test
profiles. A noteworthy forward step in this direction has been taken by
ETS in connection with the profile charts that have been prepared for
the new Cooperative School and College Ability Tests. Discussions in
the test manuals of the principles of standard errors are often helpful
to the school counselor if the terminology that is used is not excessively
technical.
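The allowance for the standard error of score differences mentioned above
can be illustrated with a minimal sketch. It assumes the usual classical
test theory formula for two scores expressed on the same scale,
SE_diff = SD * sqrt(2 - r_xx - r_yy), where r_xx and r_yy are the
reliabilities of the two scores. The function names, the 1.96 multiplier,
and all numeric values are illustrative assumptions, not figures taken
from any test manual.

```python
import math


def se_difference(sd, rel_x, rel_y):
    """Standard error of the difference between two scores that share
    the same scale (same standard deviation `sd`).

    Classical test theory: SE_diff = sd * sqrt(2 - rel_x - rel_y).
    """
    for r in (rel_x, rel_y):
        if not 0.0 <= r <= 1.0:
            raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(2.0 - rel_x - rel_y)


def is_dependable_difference(score_x, score_y, sd, rel_x, rel_y, z=1.96):
    """True when the observed difference is at least z standard errors.

    z = 1.96 is an assumed (hypothetical) criterion, roughly the 5 per
    cent level; a counselor might prefer a different allowance.
    """
    return abs(score_x - score_y) >= z * se_difference(sd, rel_x, rel_y)


# Hypothetical example: scale SD of 20, reliabilities .90 and .85.
se = se_difference(20, 0.90, 0.85)                              # about 10 points
small_gap = is_dependable_difference(115, 100, 20, 0.90, 0.85)  # 15-point gap
large_gap = is_dependable_difference(125, 100, 20, 0.90, 0.85)  # 25-point gap
```

Under these assumed figures a 15-point gap between two factor scores falls
short of the allowance, while a 25-point gap exceeds it, which is the kind
of distinction a profile chart can make visible to the test user.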

Turning now to more general considerations concerning the use of
multi-factor tests in school counseling, one of the questions that might
be raised is whether these tests may be employed effectively at the
elementary school grade levels. On the whole, there seems to be very little
evidence that factor scores have any practical advantages over general
intelligence test scores at these lower grade levels. In the negative
direction, Garrett's (2) findings indicate that intelligence factors are
relatively undifferentiated among young children, and that mental abilities
do not tend to become specialized until adolescence or early adulthood.
However, as Vernon (3, pp. 29-31) points out, the relation between the pattern
of mental organization and chronological age is not clear-cut, and the
research data in this area are often difficult to interpret because the
variables of group heterogeneity and appropriateness of test content
for different grade levels are not adequately controlled. At the present
stage of development of multi-factor tests, though, it seems advisable
that considerable caution be observed in using these tests at the
elementary school level, and that careful attention be given to the
correlations among the factors and the degree of reliability of the differences
among the scores.

Regardless of the grade level at which the multi-factor tests are used,
it is essential that the counselor keep in mind the difference between the
selection and guidance applications of the results. When tests are used
for selection purposes, the principal objective is to evaluate the
individual's present aptitudes in order to predict his probable academic or
occupational success. In this case, relatively little consideration is given
to the possibility of improving the individual's aptitudes when they are
low. In the guidance use of the test results, on the other hand, it is not
only the individual's present aptitudes that are important, but his
potential for development as well. If a low aptitude score may be a
reflection of lack of opportunity for development, some attention should
be given to the possibility of encouraging the improvement of this
aptitude through guidance and instruction.

For example, suppose an individual has a high verbal aptitude score
and a low numerical aptitude score. Does this mean that the student
should be guided toward school courses that emphasize verbal skills,
and away from courses involving mathematics? Or should the student
be stimulated to improve his numerical ability? A satisfactory answer
to this question probably requires more information about the
underlying causes of the differentiation of mental abilities than is presently
available. In the absence of any conclusive evidence to the contrary, it
would be well to keep in mind Vernon's point of view that "factors over
and above g arise, partly perhaps from hereditary influences, but mainly
because an individual's upbringing and education imposes a certain
grouping on his bonds" (3, pp. 31-32). In any event, it is important that
counselors recognize that they must make certain assumptions about
the determinants of intra-individual differences in mental abilities when
they counsel from profiles.

With the attention that is now being given to multi-factor aptitude
test scores, one might ask whether there is any value in the general
intelligence test score, or intelligence quotient, in the school situation.
It may be recalled that when the Chicago Tests of Primary Mental
Abilities were introduced, the Thurstones stressed that general
intelligence scores are of little value and that only factor scores should be
used. They did not provide for an over-all intelligence quotient on the
original edition of the test. However, they soon found that when school
teachers and counselors use an intelligence test, they expect to get an
intelligence quotient from it. The SRA Primary Mental Abilities Tests
now provide, in addition to factor scores, a total score that is designed
to serve as a single index or average of the child's intelligence level.

Whether the IQ is used as a measure of g or merely as a summary
of the pupil's performance on the test as a whole, it does seem to have
some practical utility. It provides a basis for sectioning pupils, where
this must be done on an over-all academic ability basis; it serves as a
general guide for estimating the desirability of encouraging a pupil to
prepare for a college career; and it is a key in vocational counseling to
the general occupational level for which the student should aim.

Of course, if a general intelligence score is desired for purposes such
as these, it may be obtained from either an omnibus test or from one
of the multi-factor tests that yields a composite score in addition to
factor scores. In the latter case, it might be desirable for counselors to
know the effective beta weights of the factors, instead of just the raw
score weights that are used for computational purposes, so that they
might be in a better position to interpret such discrepancies as may be
found among the total scores of a single individual on several
multi-factor tests.

In summary, the multi-factor tests apparently are beginning to meet
the needs of the school counselor for a means of making differential
predictions of academic and vocational success. The practical usefulness
of these tests in the school situation will increase as the reliability,
validity, norms, and theoretical framework of the factor scores become
more adequately established. Meanwhile, the general intelligence score
continues to have an important role in school counseling, particularly
at the elementary school grade levels.



1. Anastasi, Anne. An empirical comparison of certain techniques for
estimating the reliability of speeded tests. Educational and
Psychological Measurement, 1954.

2. Garrett, H. E. A developmental theory of intelligence. American
Psychologist.

3. Vernon, P. E. The Structure of Human Abilities. London: Methuen & Co.,
Ltd.; New York: John Wiley & Sons, Inc., 1950.






The Use of the General Aptitude Test 
Battery in the Employment Service 



PAULINE K. ANDERSON 



A national system of public employment service offices administered
by the states, but financed, coordinated and given general technical
supervision by the Federal Government was established in 1933 by the
Wagner-Peyser Act and has been in continuous existence since that time.
The Agency of the Federal Government which supervises and
coordinates the State Services is the United States Employment Service of
the Bureau of Employment Security of the U. S. Department of Labor.

Among the functions for which State Employment Services nationally
are responsible are, of course, placement itself, i.e., providing the right
worker for the right job at the right time, and secondly the vocational
counseling of those who need help in choosing suitable fields of work or
help in resolving a wide variety of job adjustment problems. The testing
program of the Employment Service, which consists of both aptitude
and proficiency tests, is the result of continuous research since the
establishment of the agency, research which has involved the cooperation
and participation of workers, employers, unions, schools and colleges
throughout the country. The General Aptitude Test Battery, which, as
I am sure you know, is a multi-factor test, is used primarily in connection
with our counseling program, especially with young workers just
entering the labor market. But it is used successfully also in the counseling
and placement of other applicant groups, particularly veterans,
older workers, and displaced workers. Like any other test used by the
Employment Service, the GATB is a technique by which we attempt 
to make our counseling and our placement more accurate and more 
effective. It is an integral part, in other words, of our total service to 
both applicants and employers; it is used only where it is needed; its 
results are interpreted in the light of total pertinent information about
the individual and about jobs. 

As an Employment Service our primary interest is of course in the
occupational qualifications of our applicants. Our tests therefore are
designed to measure qualifications for occupations as these have been
determined by experimental evidence secured primarily, though not
exclusively, from samples of workers performing successfully in the
particular occupations for which we have developed test norms. Our
aptitude test battery, for example, for Boarding Machine Operator, a
job found in hosiery manufacturing, consists of a combination of single
tests which in combination have been found to be effective in
discriminating between better and poorer employed Boarding Machine
Operators. This is an aptitude test battery which our offices might
utilize to select an applicant inexperienced in the occupation for
referral to an employer who had placed an opening with us for a Boarding
Machine Operator trainee.

However, those of you who are counselors know that when you are
dealing with a person who needs help in choosing or confirming a
vocational goal you cannot start with the requirements of a specific job
opening. You must start rather with the individual himself and try to
appraise his vocational aptitudes as broadly as possible and relate these
aptitudes to the requirements of jobs. As many of you know also, there
are quite literally thousands of specific jobs; in fact some 25,000 have
been defined by the Dictionary of Occupational Titles. Fortunately,
however, many of these can be grouped into job families on the basis of
various kinds of similarities. Such groupings make it possible and
practical to help an applicant select broad areas or fields of work in
which he has chances for successful performance, within which there
are a large number of specific jobs any one of which may serve as a
starting point for him. The GATB, because of its nature, allows the
counselor to do exactly these two things; that is, explore applicants'
abilities broadly and relate them to the aptitude requirements of fields
of work established on the basis of the similarity of their aptitude
requirements. The battery, consisting of 12 single tests, 8 of which are
paper and pencil and 4 of which are apparatus, measures nine
vocational aptitudes which have been found to be of most significance in
most jobs occurring in this country today and relates the individual's
aptitude scores in each of these nine to the aptitude requirements of
many broad fields of work which include well over 3,000 specific
occupations. This amount of occupational coverage is secured from a group
test session which lasts approximately 2½ hours. The aptitudes the
battery measures are general intelligence or scholastic aptitude (G),
numerical (N), verbal (V), and spatial (S) aptitudes, clerical
perception (Q), form perception (P), motor coordination (K), and finger and
manual dexterities (F and M). These aptitudes originally were identified
by means of factor analysis studies which involved the administration
of 59 single tests to a sample of over 2,000 individuals. The fields of work
for which the battery scores represent a wide range of type and skill level,
ranging from semi-skilled machine tending and machine operating,
simple inspection and routine clerical work through skilled machining,
printing and bookkeeping, to professional nursing, teaching, accounting,
engineering, etc.

The battery's results are expressed in two forms. The first is the Individual
Aptitude Profile; that is, the aptitude scores obtained in each of the
nine aptitudes measured. This profile gives a numerical representation
of the applicant's own strengths and weaknesses as well as his strengths
and weaknesses as compared with the general working population of
this country. General population norms were established on the basis
of a stratified quota sample of 4,000 workers selected from over 8,000
cases reflecting an exact representation of the occupational distribution
of the national labor force as given by the 1940 census except for certain
deliberate exclusions of farm, forestry, mining and personal service
occupations. In addition to its occupational representativeness the
sample is typical of the age, sex and geographical distribution of this
country's working population. The mean for each aptitude score has
been set at 100 with a standard deviation of 20. Thus the counselor is
able to see by a glance at the Individual Aptitude Profile whether the
applicant generally tends to meet, exceed, or fall below this average. He
can see at a glance also in what kinds of aptitudes he tends to excel or
fall below; that is, the cognitive vs. the motor, the more abstract vs.
the simpler and more concrete, etc. The profile also makes it possible
to see quickly where the applicant's own best abilities lie regardless of
his standing in relation to the general population. Thus it can be seen,
for example, that John Jones does best in numerical and spatial
aptitudes and poorest in verbal aptitude and clerical perception.
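The two readings of a profile described above (standing relative to the
general working population, and the applicant's own best and poorest
aptitudes) can be sketched as follows. The sketch assumes a simple linear
conversion of a raw score to the scale with mean 100 and standard deviation
20; the actual GATB conversion procedure is not described here, so the
function names, the norm-group figures, and the profile values are all
hypothetical.

```python
def to_standard_score(raw, norm_mean, norm_sd, target_mean=100.0, target_sd=20.0):
    """Linearly rescale a raw score so that the norm group has the
    target mean and standard deviation (assumed linear conversion)."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    z = (raw - norm_mean) / norm_sd
    return target_mean + target_sd * z


def strongest_and_weakest(profile, k=2):
    """Return the k highest and k lowest aptitudes in a profile,
    regardless of standing relative to the general population."""
    ranked = sorted(profile, key=profile.get, reverse=True)
    return ranked[:k], ranked[-k:]


# Hypothetical raw score one norm-group SD above the norm-group mean.
example_score = to_standard_score(raw=60, norm_mean=50, norm_sd=10)

# Hypothetical profile on the nine aptitudes (mean 100, SD 20 scale).
profile = {"G": 100, "V": 85, "N": 118, "S": 114, "P": 102,
           "Q": 88, "K": 97, "F": 104, "M": 99}
best, poorest = strongest_and_weakest(profile)
above_average = [apt for apt, score in profile.items() if score > 100]
```

With these invented values the profile reproduces the pattern in the
example: best in numerical and spatial aptitudes, poorest in clerical
perception and verbal aptitude.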

As I indicated earlier, however, our main concern is with the
occupational qualifications of each applicant. Therefore in day to day work
with the GATB, the counselor's main interest is centered less in the
Individual Aptitude Profile than in the fields of work for which the
applicant qualifies. The GATB indicates such fields by means of
Occupational Aptitude Patterns, which consist of jobs grouped together
on the basis of the similarity of their aptitude requirements. Those jobs
which have been found to require the same combination of significant
aptitudes to the same minimum degree constitute an Occupational
Aptitude Pattern. Each pattern uses the multiple cut-off method to
determine occupational qualification or non-qualification, at least on
the basis of test performance. Thus, the individual is considered qualified
for an Occupational Aptitude Pattern only if he meets the minimum
score on each of the aptitudes found to be significant for this particular
family. The aptitude requirements themselves have been established
on the basis of experimental evidence secured from samples of workers
employed in the occupations making up the field or Occupational
Aptitude Pattern plus, in some instances, samples of senior students
and/or apprentices successfully completing particular courses of
training for certain occupations on the vocational, technical or professional
level.
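The multiple cut-off rule described above can be sketched in a few lines.
This is only an illustration of the decision rule: the pattern names, the
aptitudes listed as significant, and every minimum score below are
hypothetical and are not actual Occupational Aptitude Pattern norms.

```python
def qualifies(profile, minimums):
    """Multiple cut-off rule: the applicant must meet the minimum on
    every aptitude that is significant for the pattern. A high score on
    one aptitude cannot compensate for a low score on another."""
    return all(profile[apt] >= cut for apt, cut in minimums.items())


def qualifying_patterns(profile, patterns):
    """Names of all patterns whose cut-offs the profile satisfies."""
    return [name for name, mins in patterns.items() if qualifies(profile, mins)]


# Hypothetical patterns: each lists only its significant aptitudes.
patterns = {
    "clerical (hypothetical)": {"G": 90, "N": 85, "Q": 95},
    "technical (hypothetical)": {"G": 110, "N": 105, "S": 100},
    "machine tending (hypothetical)": {"P": 85, "K": 90, "M": 95},
}

# Hypothetical applicant profile on the nine aptitudes.
applicant = {"G": 93, "V": 90, "N": 90, "S": 105, "P": 110,
             "Q": 100, "K": 95, "F": 102, "M": 108}

fields = qualifying_patterns(applicant, patterns)
```

In this invented case the applicant meets every minimum for the clerical
and machine tending patterns but misses the technical pattern on G and N,
even though the S score exceeds its cut-off; that is the non-compensatory
character of the multiple cut-off method.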

One of the most significant contributions made by the GATB lies in
the fact that it helps to underline what most of us know but too
frequently tend to forget in practice; namely, that most people can do
more than one thing well and that a high degree of what we call general
intelligence or scholastic aptitude is of primary importance in only a
rather small percentage of the total number of existing jobs. Thus, the
GATB brings out the fact that even for persons of rather limited
intellectual ability there exist many occupational outlets in which their
performance can be not marginal but truly successful. For example, here
is a high school graduate whose G, V, N scores are 93, 90, 90 but who
still qualifies for seven fields of work, two of which consist of many kinds
of clerical jobs. Here is another case of a girl who probably is defective
(her scores are 67, 68, 67, and 78 in G, V, N, and S respectively) but who
still qualifies for four different fields of work, one of which in itself
includes literally hundreds of semi-skilled industrial occupations
involving machine tending and operating.

The GATB also of course helps uncover those who could profit from
higher education. We encourage such people to consider college when
other evidence also supports the evidence of the test, and frequently
through referral to and assistance from other community agencies make
it possible for them to go to college. One of our offices recently tested a
high school graduate who had no thought of going on to college. He
expressed interest in clerical jobs but his G, V, N, and S scores were
[illegible], [illegible], [illegible] and [illegible] respectively. He
qualified, among many other things, for both accounting and engineering.
College as a possibility was discussed with him and he was referred to our
Consultation Service for specific information about individual colleges
and their requirements and for possible financial assistance. In the
meantime he was placed on a summer job as a file clerk.

Another applicant was a [age illegible] year old Korean Veteran, a high
school graduate whose only civilian work experience had been as an order
filler, an unskilled job. His assignment in the Army was as a general
clerk. Some of his GATB scores were:

G            V            N     S
[illegible]  [illegible]  122   130   etc.

He qualified for many professional and technical occupations. His
Interest Check List also indicated many scientific preferences. He too







was referred to our Consultation Service for assistance in college
planning and he too was placed, as a Chemist Assistant.

Thus the GATB by its nature makes it possible for us to help the
applicant choose vocational goals which will utilize his maximum
potentialities. And this is a very fundamental policy of both our
counseling and placement activities: to make maximum utilization of the
potential or actual skills of our applicants. An applicant, for example, may
qualify according to the test results for a field of work which includes
occupations on a highly skilled trade level; e.g., machinist, tool and die
maker, etc. He may qualify also for a variety of machine operating and
tending jobs which are mostly of a semi-skilled nature. Unless there was
some very good reason having to do with the individual himself, the
counselor of course would encourage the more skilled occupations as
the goal. Or an applicant may qualify for an Occupational Aptitude
Pattern which utilizes only one or two of his own best abilities and
qualify also for a second which utilizes more of his own aptitude strengths
or utilizes them at a higher level. Again, unless there was other specific
evidence against it, the counselor would encourage consideration of the
second field, even if the skill classifications of the jobs in each were equal.
Or, to take a converse situation, an applicant may express some interest
in engineering as a vocational goal but we may find that, at least as far
as the test results are concerned, he meets the minimum requirements
for machinist but does not come anywhere near meeting the minimum
requirements for either drafting or engineering. In other words the
GATB helps indicate the uppermost level of skill within related fields
of work to which the applicant may aspire.

Obviously, when we use the term maximum utilization we use it in a
relative not an absolute sense. Sometimes an applicant may just barely
meet the minimum requirements of a field but show greater strength for
a related field on a slightly lesser level of skill. It may be that for this
individual maximum utilization would be achieved more successfully
through the somewhat lower level jobs. This brings me to the most
important point that I wish to leave with you, one which I mentioned
earlier: namely, that we use test information as only one piece of
information, important though it may be, about a total individual who
is not after all made up just of aptitudes nor whose job success will be
based solely on aptitudes. His interests, his goals, his education, his
work history if any, all must be taken into account in order to determine
accurately his total occupational qualifications and to make relatively
accurate predictions of his probable job performance. In the Employment
Service we use the shorthand SKAPATI to emphasize the variety
of factors which must be evaluated in order to arrive at a sound
occupational classification which in turn will lead to accurate matching of
the applicant's qualifications with job requirements.

SKAPATI is translated as follows: The "SKA" equals skills, knowledge
and abilities, information about which is secured primarily through
interviewing the applicant about his school and work record,
supplemented, where necessary, by an actual school record or by a check with
former employers. The "P" of SKAPATI stands for physical capacities,
that is, the general physical condition of the applicant and/or any
physical disabilities which he may have that need to be taken into
account in helping him find a suitable field of work. Information about
physical capacities is secured through interview information,
observation and where necessary by medical reports. The second "A" is for
aptitude information which is secured through interview information
supplemented by aptitude test measurement. "T" and "I" stand for personal
traits and interests, information about which is secured again primarily
through the interview and observation plus the use of an Interest Check
List and/or reports from schools, former employers, social agencies,
etc. Since all this information is given weight in the counseling situation
there are instances in which an applicant might very well be encouraged
to work toward a goal even though his test scores, when looked at in
isolation, might indicate less chances of successful performances. Of
course in such cases the counselor would indicate to the applicant that
he might have to work a little harder than others to achieve success or
that he might have to be satisfied with satisfactory rather than
outstanding performance. But the important point is that aptitude
qualifications by themselves, even though derived from an instrument as
well standardized as the GATB, nevertheless are never used as the sole
basis for counseling or for placement. In fact our total counseling process
and our use of tests in it are rooted in an individual, analytical approach
whereby the counselor attempts to recognize and understand the
particular individual's uniqueness as an individual and to help him realize
this through both his choice of work and his actual entry into the field
of his choice.





Service Tests of Multiple Aptitudes*



EDWARD E. CURETON 



The present trend in aptitude testing, both for educational and 
vocational guidance and for employment and placement, is away from 
both general intelligence testing on the one hand, and from measure- 
ment of specific vocational aptitudes on the other. A profile of aptitude 
scores provides more information than does an average, and even in 
the cognitive area there are important measurable aptitudes which do 
not fit the usual definitions of general intelligence. But the development
of a special battery for every important occupation is a hopeless task, 
and even if we had such a set of batteries, no examinee could possibly 
take all of them. The results of the factor analysis studies made during 
the last 20-odd years point the way to a reasonable compromise. It 
appears from them that a battery of perhaps two or three dozen tests 
will measure most of the important cognitive, perceptual, and
sensori-motor aptitudes, and that a battery of only a half-dozen to a dozen
will measure the really crucial ones fairly well. For at least a large number,
if not a majority, of the thousands of different occupations, the best
prediction of success yielded by such a battery will not be improved 
greatly by the addition of special tests, so long as these latter still 
measure cognitive, perceptual, and sensori-motor aptitudes. Tests of 
interest, attitude, personality, and the like are another matter, but in
these domains much work remains to be done in test development
before the procedures of factor analysis can be expected to yield
anything approaching definitive results.

This is not to say that in the domains here at issue (the cognitive,
perceptual, and sensori-motor) the factor analysis results are as yet
definitive, but in these areas they are at least highly suggestive, and
there is enough agreement among them to permit some useful
application. The trend toward such application has already appeared, and
the purpose of this paper is to assess its present status.

My first and most important task was to try to identify and charac- 
terize in some uniform language the factors measured by the tests of 
sixteen batteries. The material at the beginning of Table 1 lists the 



*By a service test, we mean a test designed for service uses such as guidance,
employment, placement, etc., as contrasted with the experimental batteries used in
many factor analysis studies. Batteries developed by the military services are not
included in this review.






factors and abilities employed. The characterizations themselves are in 
the body of the table under the column heading, "Probable Factors." 
This was strictly an arm-chair job. The lists of factors and abilities are 
certainly incomplete. By any reasonable definition of a factor, some of 
those listed are too broad, and one or two may be too narrow. Many 
of the characterizations are undoubtedly in error, even in terms of 
these lists. 

Whenever it appears that a test should have high loadings on several 
important factors, I have listed several, in the order of their presumed 
importance or magnitude from left to right. There will be more errors, 
needless to say, in these judgments of order than in the judged factors 
themselves. Despite all their obvious shortcomings, however, I venture 
to hope that these characterizations will be sufficiently valid for most
practical purposes until such time as we can conduct a large factorial 
study of several of the service batteries themselves. Such a study would 
also yield equivalent scores for the various batteries, and this also is so 
important that when such a study is made I am not sure which objective 
would be primary and which secondary. 

In deriving the list of factors I was fortunate in having available the 
work of a set of committees which, under the auspices of the Educational 
Testing Service, identified sixteen aptitude factors as fairly well estab- 
lished, and named at least three tests which could be recommended as 
reference tests for each in future factor analyses. For their objective, 
the preparation of lists of standard reference tests, two factors could 
be considered distinct whenever they could be defined in terms of two 
distinct groups of reference tests. In selecting reference tests, moreover, 
they could occasionally include one with a known high loading on the 
given factor but a higher loading on some other factor. But for my 
objective, the factorial characterization of tests, some of which have not 
appeared in actual factor analyses, a more rigid criterion for a distinct 
factor seemed essential. First, it must be distinct from all others in 
terms of explicit concepts, not merely in terms of different sets of
reference tests. Second, in terms of these concepts it must be broad
enough to imply at least two or three reference tests which differ in
apparent content. On applying these criteria to the committees' sixteen
factors, it appeared necessary to modify some of them for my purpose. 

Though I agree with the committee that vocabulary is the central 
core of the verbal factor, I broadened it to include such tests as para- 
graph meaning, proverb matching, sentence completion, verbal analo- 
gies, and even general information. The general concept used might be 
termed verbal comprehension. There appears to be good evidence in the 
factor analysis literature for the existence of a verbal factor, or rather 







a group of verbal factors, covering this range. I am not sure, on the 
other hand, that the tools of English expression, such as spelling, 
grammar, punctuation, and the like, belong under this factor, and in 
fact their factorial structure seems not well determined so far.
I listed them, therefore, as two abilities: spelling separately because it 
is usually tested separately, and the others under the general term, 
"language usage," because they are frequently measured together. By 
an ability I mean merely an area of measurement which is conceptually 
distinct but whose factorial structure is as yet not well determined. 

The deduction factor was retained, but the concept was narrowed to 
include only formal tests of the syllogistic type. This comes perilously 
close to violating my own criterion of breadth, but the factor analysis 
results required its retention. Two of the committee's reference tests
were of the syllogistic type, but the third was a verbal analogies test,
which, in those factor analyses with which I am acquainted, loads
substantially on the deduction factor but higher on the verbal factor.

From such knowledge as I had of factorial literature, including the
committee's report, I was unable to formulate clearly distinct concepts
for the reasoning factors other than deduction, so I lumped them all
together under one heading. The same situation appeared in the case
of the space factors and the fluency factors, so each of these groups was
represented also by a single factor.

To cover certain other tests in the sixteen batteries, I added an im- 
mediate memory factor to the committee's list. This also undoubtedly 
represents several factors, but we do not yet know how to differentiate 
among them. 

The committee listed only one perceptual speed factor, and this I let 
stand despite the evidence from the USES studies that clerical speed
may be a separate factor. 

To characterize one or two tests in the sixteen batteries, I postulated
a factor, visual discrimination, which so far as I know has not been
found in any factorial study. I am quite certain it will be found as soon as
appropriate tests are included in such studies. Appropriate tests would
include sets of lines, not parallel, with one just slightly longer or shorter
than the rest to be identified; sets of arcs of circles with one having a
just slightly longer or shorter radius than the others, or one which is
not quite circular; sets of circles with smaller concentric circles within
them and one not quite concentric; and the like. These tests, unlike the
identical forms tests which measure the perceptual speed factor, would
be administered with generous time limits. They would, I predict, reveal
a factor representing individual differences in a more or less general
visual difference limen. Drake has already shown, for the concentric






circles test, that it is a valid predictor of effectiveness in several types 
of visual inspection work. 

Flanagan has argued with some cogency that reasoning and judgment
are not the same, and the concepts as such are discriminable. So I added 
judgment as another unanalyzed ability, of which there are probably 
several distinct varieties. 

It was necessary to add two other abilities, mathematical achievement 
and science information, to cover tests in one or two service batteries. 
There have been some factor analyses of achievement tests, but the 
results of these studies do not yet seem to me to warrant confident 
identification of the factors generated by the study of school subjects 
above the elementary level. 

Finally, the characterizations were limited to factors and abilities 
measured by paper-and-pencil tests. Two of the service batteries include 
apparatus tests, but these tests and the factors generated by them do not 
appear in Table 1. 

Wherever service batteries were constructed directly from factor 
analysis data, the characterizations of them in Table 1 will usually
agree substantially with those of their authors. Some exceptions will
occur because of my use of combined factor-categories, and a few more 
due to differences in sheer nomenclature. Thus, my concept of the 
number factor is essentially speed and accuracy in solving problems of 
no intrinsic difficulty, but some authors use this term to cover the 
harder tests of arithmetic computation and problem-solving. In Table
1, such tests are characterized as reasoning tests. 

The sixteen service batteries fall roughly into three categories. In the 
first we find those whose primary objective in test selection seems to be 
to approximate pure-factor measurement as closely as possible. In this 
category we find the first nine tests in Table 1:
ACE Primary Mental Abilities,
Chicago Primary Mental Abilities, ages 11-17,
SRA Primary Mental Abilities, ages 11-17,
SRA Primary Mental Abilities, ages 7-11,
SRA Primary Mental Abilities, ages 5-7,
Holzinger-Crowder Uni-Factor Tests,
Factored Aptitude Series,
Guilford-Zimmerman Aptitude Survey, and
USES General Aptitude Test Battery.
In the second category we find batteries developed on a compromise 
basis. Range of factorial coverage was an objective but factorial purity 
of the tests was less important, and the particular tests were designed 
to be similar to others which had been shown to be valid predictors of a 





variety of educational and occupational criteria, or to measure abilities 
shown by job analyses to be important elements of many occupations. 
This category includes batteries IX through XII in Table 1, namely: 
USES General Aptitude Test Battery, 
Differential Aptitude Tests,
Multiple Aptitude Tests, and
Flanagan Aptitude Classification Tests.
The USES General Aptitude Test Battery is a transitional type listed 
in both the first and second groups, and the Flanagan Aptitude Classi- 
fication Tests are listed in both the second and third groups. In the third 
category we find batteries designed to predict multiple criteria, often 
over only a limited range such as educational curricula, but sometimes 
over a very wide range. An attempt, necessarily somewhat hasty and 
hence probably less than completely successful, was made to include 
here all batteries belonging properly in the first two categories, but
excepting those whose distribution is restricted to the armed forces, other
government agencies, private firms, and organized testing programs. In
the case of the third category, no such attempt was made. Table 1
includes only the tests in this category numbered XII through XVI:
Flanagan Aptitude Classification Tests, 
Yale Educational Aptitude Test Battery, 
Aptitude Tests for Occupations, 
Engineering and Physical Science Aptitude Test, and 
Cleeton Vocational Aptitude Examination. 
These are merely a small group of such batteries— the ones which came 
to hand most readily during the preparation of this paper. They were 
included merely to show that the multiple validity approach and the 
multiple factor approach to battery construction often lead to quite 
similar productions. Factor analysis provides a clearer rationale, how- 
ever, and on the practical side it shows which valid tests can safely be 
omitted from a battery if certain others are present in it. 

To report in any detail the data on the reliabilities, validities, and
norms of these batteries is impossible within the limitations of a short 
paper, and valid comparisons are impossible in the absence of data from 
several batteries given to the same group. And since computing methods 
as well as reporting methods vary from manual to manual, there seems 
to be no useful method of summarizing such information in tabular 
form. My remarks on these matters must therefore be brief, impres- 
sionistic, and somewhat scattered. 

For Thurstone's original ACE Test for Primary Mental Abilities, I 
had only copies of the booklets, but it is my impression that this battery 
is not very widely used for service testing, and that most of the studies 






in which it has been used were experimental in nature and are reported
in journal literature. Its immediate successor is the Chicago Tests for
Primary Mental Abilities, but this battery is an easier version based on
a second factor analysis, and is recommended only for children aged
11 to 17. The data here and in Table 1 are for the single-booklet edition,
a shorter version of a previous multiple-booklet, separate-answer-sheet
edition. The single-booklet edition is booklet-marked. Percentile norms
for the six factor scores are provided for each half-year age group from
11 to 17.5, based on the scores of about 18,000 Chicago children in 29
elementary and 31 high schools. A profile chart is printed on the back
cover of the test booklet. High correlations between the factor-score
composites and the actual primary-factor scores are reported, along
with the results of a second-order analysis of the correlations between
the primary factors. This latter analysis yielded a single general factor,
with the highest loading on reasoning, fairly high loadings on verbal,
fluency, and number, and lower loadings on space and memory.
Reliabilities are given for the test scores and the factor scores, but no
correlations with external criteria are reported.

The SRA Primary Mental Abilities for ages 11 to 17 is a shorter 
version of the same battery, with the memory tests omitted and the 
remaining five factors measured by one test each. The test booklet is
re-usable, with a replaceable carbon-back answer sheet, and there is
also a machine-scored edition. Percentile norms for each year of age
from 11 to 17-or-over are incorporated directly into the profile chart.
They are based on the scores of those children of each given age who
were in junior and senior high schools, not on the full age-ranges. The
original norms were apparently developed from the data for the parent
battery, but further adjustments were made on the basis of a second
sample whose size was not stated. Deviation-type quotient norms are
provided also on the profile sheet, with mean 100 corresponding to 
percentile 50, and standard deviation 16. The reliabilities and inter- 
correlations of the tests are reported, as are also their correlations with 
several intelligence tests, the Iowa Tests of Educational Development, 
the Stanford Achievement Test, and the USES General Aptitude Test 
Battery. For these last data, for a tenth grade sample and a twelfth 
grade sample, with the Minnesota Clerical Test and the Revised Minne- 
sota Paper Form Board Test included also, two factor analyses are 
reported. The factors found were intelligence (actually reasoning), paper- 
motor speed (a fusion of our aiming and motor speed factors), space, 
perceptual speed, number, dexterity (from the four apparatus tests 
scores), and verbal. No correlations with external criteria are reported. 
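
The relation between percentile norms and the deviation-type quotient
scale mentioned above (mean 100 at percentile 50, standard deviation 16)
can be illustrated with a short Python sketch. It assumes that scores in
the norm group are normally distributed, which is an assumption made
here for illustration and is not stated in the manual; the function names
are likewise hypothetical.

```python
from statistics import NormalDist

# Scale parameters stated in the text: mean 100 (percentile 50), SD 16.
# The normal shape of the distribution is an assumption of this sketch.
QUOTIENT_SCALE = NormalDist(mu=100, sigma=16)


def percentile_to_quotient(percentile: float) -> float:
    """Map a percentile rank (strictly between 0 and 100) to a quotient."""
    if not 0 < percentile < 100:
        raise ValueError("percentile must lie strictly between 0 and 100")
    return QUOTIENT_SCALE.inv_cdf(percentile / 100)


def quotient_to_percentile(quotient: float) -> float:
    """Inverse mapping: deviation quotient back to a percentile rank."""
    return 100 * QUOTIENT_SCALE.cdf(quotient)


if __name__ == "__main__":
    for p in (16, 50, 84):
        print(p, round(percentile_to_quotient(p), 1))
```

On such a scale a quotient of 116 lies one standard deviation above the
mean, or at roughly the 84th percentile of the norm group.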







The SRA Primary Mental Abilities for ages 7 to 11 is the only multiple
aptitude test available for use with this age range. The test booklet is
re-usable, with a carbon-backed answer sheet. The accompanying
materials provide age norms for ages 6-0 to 14-0 at two-month intervals for
the five factor scores and for the four separate tests of the verbal and
reasoning factors, based on scores of 4,714 children aged 7 to 12, and
revised on the basis of 2,000-odd additional cases. The norms for ages
6 to 7 and 12 to 14 are extrapolated. IQ and non-reading IQ estimates
are obtained from weighted composites of five and three test quotients
respectively, and directions are provided for computing and using a
reading aptitude quotient, an arithmetic aptitude quotient, and a
measure of current reading/experience status based on the differences
between scores on the written and picture-oral tests of vocabulary and
reasoning. An interpretation booklet includes a profile chart embodying
the age norms, and directions for making the IQ, reading, and arithmetic
estimates. A technical supplement to the manual reports the reliabilities
and intercorrelations of the test scores, and correlations with several
intelligence tests, reading tests, and arithmetic tests.

The SRA Primary Mental Abilities Test for ages 5 to 7 is the only
multiple aptitude battery available for use with young children. It is
entirely oral and pictorial, and is booklet-marked. The cover page of
the booklet includes a profile chart which embodies age norms from 3-0
to [9?]-0 at two-month intervals, and a separate column of age scores for
a weighted composite total score representing intelligence, based on
scores of 1,200 children aged 5 to 8. The manual discusses methods for
estimating reading readiness, arithmetic readiness, and motor
coordination. The technical supplement reports reliabilities and intercorrelations
among the scores, and correlations with the Stanford-Binet, Form L, and
with several reading, arithmetic, and general school achievement tests.
In some cases the achievement tests were given more than a year later.
The evidence presented justifies the claim that this test provides fairly
valid evidence of first-grade readiness.

The Holzinger-Crowder Uni-Factor Tests appear in two equivalent
machine-scorable forms. End-of-year percentile norms for each factor
score are given for grades 7 to 12, and also for the scholastic aptitude
score. A profile chart is provided on the back of one of the three answer
sheets. Age norms are not given in the manual, but a footnote says they
can be obtained from the publisher. The norms are based on the scores
of over 10,000 students in grades 6 through 12 in 38 schools in 28
communities in 7 states. An IQ table for the Scholastic Aptitude score is
provided also, based on equi-percentile equating to the Terman-McNemar
in a subsample of over 2,000. Alternate-form as well as split-half





reliabilities are reported, along with the intercorrelations of the factor
scores. Correlations are reported between the factor scores and several
dozen achievement tests, a considerable variety of high school subject
grades, and seven intelligence tests, but none with external criteria of
occupational success. Data are also given on sex differences: boys are
better on space, girls on verbal and scholastic aptitude, and the results
for number and reasoning are inconclusive, though the girls have again
a slight edge. Practice effect is substantial for space, slight for reasoning,
and negligible for verbal and number. The essential interchangeability
of the two forms is shown to justify one set of norms for both.
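
Equi-percentile equating, mentioned above in connection with the IQ
table, assigns to each score on one test the score on a second test that
has the same percentile rank in a common group. A minimal Python
sketch follows. The mid-percentile definition of rank, the linear
interpolation, the function names, and the sample data are all
assumptions made for illustration; the manual's actual procedure is not
described in this paper.

```python
from bisect import bisect_left, bisect_right
from typing import Sequence


def percentile_rank(scores: Sequence[float], x: float) -> float:
    """Mid-percentile rank of x: percent below plus half the percent equal."""
    ordered = sorted(scores)
    below = bisect_left(ordered, x)
    equal = bisect_right(ordered, x) - below
    return 100.0 * (below + 0.5 * equal) / len(ordered)


def score_at_rank(scores: Sequence[float], rank: float) -> float:
    """Score whose mid-percentile rank equals `rank`, interpolating
    linearly between the distinct observed scores."""
    values = sorted(set(scores))
    ranks = [percentile_rank(scores, v) for v in values]
    if rank <= ranks[0]:
        return values[0]
    if rank >= ranks[-1]:
        return values[-1]
    i = bisect_left(ranks, rank)  # ranks[i - 1] < rank <= ranks[i]
    frac = (rank - ranks[i - 1]) / (ranks[i] - ranks[i - 1])
    return values[i - 1] + frac * (values[i] - values[i - 1])


def equate(x: float, test_x: Sequence[float], test_y: Sequence[float]) -> float:
    """Equi-percentile equivalent on test Y of score x on test X."""
    return score_at_rank(test_y, percentile_rank(test_x, x))


if __name__ == "__main__":
    # Invented scores of one group on two tests.
    aptitude = [10, 20, 30, 40, 50]
    intelligence = [100, 110, 120, 130, 140]
    print(equate(25, aptitude, intelligence))
```

In practice the equated values would be tabulated once for every
possible raw score, which is the form an IQ conversion table takes.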

The Factored Aptitude Series consists of sixteen four-page test
booklets, and a seventeenth to be used for recording the results of a
nuts-and-washers apparatus test. The tests are all booklet-marked. On the
back of each test booklet is a brief rating scale: "On a regular job I
would like to do the kinds of tasks represented by this test: A) As a
major part of my work, B) Frequently, C) To a moderate degree,
D) Only occasionally, E) As seldom as possible. Remarks."
An interest index for job areas is based on the ratings for the several
tests, and correlations with the Kuder, Lee-Thorpe, and Strong are
reported to range from .35 to .70. Reliabilities, intercorrelations among
the tests, and validities against job criteria are reported in the manual
and in various published notes, usually without exact data on the sizes
and compositions of the samples. A complete statistical report is in
preparation, but was not available at the time this report was written.
Stanine and percentile norms are given for the seventeen tests, based on 
samples of the working population of unstated size, but apparently 
quite large. Tables are given for converting stanines on sub-batteries 
into weighted aptitude indices for 24 basic job-test areas. The weights 
appear to be based on qualification standards as well as regression co- 
efficients, and the resulting 20-point qualification levels are interpreted 
uniformly. There is a profile booklet, and a qualification grid booklet 
for computing and recording the 24 job-area qualification scores. These 
tests are intended primarily for use in employment and placement, and 
in consequence the manual and notes are addressed mainly to business 
and industrial executives. Much of the material in them is frankly 
educational rather than strictly technical and descriptive. Professors of 
measurement may be surprised on reading them to discover how much 
of personnel testing theory can be presented in quite practical and non- 
technical language. 

The Guilford-Zimmerman Aptitude Survey consists at present of the 
seven booklets of Form A. Separate answer sheets may be used with the 
tests of verbal comprehension, general reasoning, spatial visualization,







and mechanical knowledge (the power tests), and if necessary with
spatial orientation also, though this is not recommended. The speed
tests, numerical operations and perceptual speed, must be
booklet-marked. Form B is in preparation, and the authors intend to expand
the battery eventually to some 20-odd tests. A profile chart gives
C-scores (an eleven-point scale with mean 5 and standard deviation 2),
T-scores (mean 50 and standard deviation 10), and centiles for college
men and college women, based on the scores of approximately 2,700
men and 1,500 women, mostly Freshmen, at the University of
Washington, Northwestern University, and the University of Southern
California. Reliabilities, intercorrelations among the subtests, and validities
for grades in several dozen college subjects are reported, along with a
few correlations with occupational criteria, and studies verifying the
factorial structure.
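
The two derived-score scales named above differ only in their mean and
standard deviation, as the following Python sketch shows. It assumes a
simple linear conversion from a z-score; the published C-scale may
instead be built from normalized percentile bands, so the function
names, the rounding rule, and the example norm values here are
illustrative assumptions rather than the authors' procedure.

```python
import math


def z_score(raw: float, mean: float, sd: float) -> float:
    """Standardize a raw score against a norm group's mean and SD."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw - mean) / sd


def t_score(z: float) -> float:
    """T-score scale: mean 50, standard deviation 10."""
    return 50 + 10 * z


def c_score(z: float) -> int:
    """C-score scale: mean 5, standard deviation 2, whole numbers 0 to 10.

    Rounds half up and clips to the eleven available points
    (both choices are assumptions of this sketch).
    """
    rounded = math.floor(5 + 2 * z + 0.5)
    return int(max(0, min(10, rounded)))


if __name__ == "__main__":
    # Hypothetical norm group: mean raw score 20, standard deviation 5.
    z = z_score(raw=25, mean=20, sd=5)
    print(z, t_score(z), c_score(z))
```

The coarser C-scale discards small differences that the T-scale keeps,
which may be a reasonable trade when profile differences of a point or
two are within the error of measurement.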

The revised General Aptitude Test Battery now has eight paper-and-
pencil tests and two apparatus tests. Three of the original paper-and-
pencil tests have been eliminated, and two of the original factors (aiming
and motor speed) combined. With the exception of mark making, a
pure speed test, booklet marked, there are two parallel answer-sheet
marked forms of each of the paper-and-pencil tests. The statistical data
reflect the resources of a government bureau. The basic norms were
determined for the earlier edition on a sample of 4,000 employed persons,
selected from more than 8,000 to be representative of the general
working population in terms of occupation, age, and region, and consisting
of approximately equal numbers of men and women. The new Form A
was standardized by equating its standard scores to those of the original
form on a sample of 585 high school and junior college students who took
both forms. The new Form B was standardized in the same manner
against Form A on a sample of 412 high school juniors and seniors.
Aptitude score norms are weighted-composite standard scores with mean
100 and standard deviation 20, with weights which adjust for differences
in both raw-score standard deviations and regressions of the tests on
the factors defining the aptitudes. Seven studies report reliabilities,
three report intercorrelations of the tests, three report intercorrelations
of the aptitudes, ten report correlations with college grades, and seven
report correlations with other batteries and tests. Other studies report
sex differences and increase of scores with age. Occupational aptitude
patterns consist of minimum scores on subsets of three or four aptitudes.
This is a multiple cut-off system. Minimum scores are given for 17 job
families and five ungrouped occupations, based on 99 validity studies
yielding significant results. A profile card and a card giving the minimum
scores for job families and ungrouped occupations are supplied for the



use of State employment offices. Unlike the other tests here considered,
the General Aptitude Test Battery is not for sale, but wherever a
legitimate testing need exists which can be met by it, the local office
of the State Employment Service is usually glad to cooperate.
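
The logic of a multiple cut-off system can be shown in a few lines of Python. This is a sketch only: the aptitude names are taken from Table 1, but the minimum scores and the two applicants are hypothetical and are not drawn from any published occupational aptitude pattern.

```python
from typing import Dict

# Hypothetical minimum scores for one illustrative job family.
# Aptitude scores are on the scale with mean 100 and standard deviation 20.
PATTERN_MINIMUMS: Dict[str, int] = {
    "Intelligence": 100,
    "Numerical": 95,
    "Spatial": 90,
}


def meets_pattern(scores: Dict[str, int], minimums: Dict[str, int]) -> bool:
    """Multiple cut-off rule: every listed aptitude must reach its minimum."""
    return all(scores[name] >= cut for name, cut in minimums.items())


# Two hypothetical applicants.
applicant_a = {"Intelligence": 105, "Numerical": 98, "Spatial": 92}
applicant_b = {"Intelligence": 135, "Numerical": 120, "Spatial": 85}

print(meets_pattern(applicant_a, PATTERN_MINIMUMS))  # every minimum is met
print(meets_pattern(applicant_b, PATTERN_MINIMUMS))  # Spatial falls short
```

The second applicant fails in spite of two very high scores; under a multiple cut-off rule strength in one aptitude does not compensate for a deficiency in another.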

The six booklets of the Differential Aptitude Tests are reusable, all
responses being recorded on answer sheets, and there are two equivalent
forms of each test. The norms are scores or score-ranges corresponding
to 23 selected percentiles. They are given for each sex and each form
for grades 8 to 12, and are based on the scores of over 47,000 pupils in
over 100 school systems covering every region. A profile sheet is provided,
with percentiles laid off on a normal-distribution scale, and a
T-score scale in equal units beside it, with layout such that one inch
represents the 1 percent significance level for the difference between the
scores on the two tests. The T-score scale has mean 50 and standard
deviation 10. Several thousand correlations are reported between the
tests and school grades, and some of these are summarized in charts
showing the distributions of coefficients for the major course areas:
English, history and social studies, mathematics, and science. A considerable
number of the studies show correlations between scores and
course grades received 6 months to 3.5 years later, and still others show
correlations with college freshman grades. Some hundreds of correlations
with achievement test scores are reported, along with percentile
equivalents of average test scores of students in various college curricula
and in a dozen-odd occupational groups. Reliabilities are reported for
each test by grade and sex, together with two sets of retest coefficients
after three years. Mean within-grade intercorrelations among the tests
are reported for boys and girls, and correlations are also reported with
ten other aptitude tests and batteries, and with the scales of the Kuder
Preference Record.

The Multiple Aptitude Tests consist of nine booklets. They may be
used with or without separate answer sheets. T-score and percentile
norms by test are given for each sex for grades 7 to 13, based on 11,000
cases from eight regions. The T-scores are normalized standard scores
with mean 50 and standard deviation 10. Differential intelligence norms
give score ranges for each test, at 15 selected percentiles, for children
of high, average, and low intelligence, at junior and senior high school
levels, by sex. A profile chart is provided on which the standard scores
of the tests may be averaged to obtain the factor scores. Reliabilities
of the tests and factors are reported by grade and sex, and intercorrelations
among the tests by sex for grades 7 to 13 combined. Correlations
between each test and 15 other tests are reported by sex, as are
also correlations with 16 school subjects, and these latter data are charted
to exhibit differential validity. Additional studies report correlations 
with college freshman grades in four subjects at one college and in five 
at another, the latter separately by sex. Factor analyses of the two 
intercorrelation matrices are reported, along with data on the validities 
of the factor scores for scholastic performance. Expectancy tables for 
school marks predicted by separate tests are given for 19 combinations 
of test and subject. No correlations with occupational criteria are re- 
ported. 

The Flanagan Aptitude Classification Tests come in fourteen book- 
lets. They are booklet marked, with carbon-back self-scoring grids, and 
each grid contains a small table for converting the raw scores to stanines. 
There are alternate forms (Form B) for inspection, coding, memory, 
scales, arithmetic, patterns, tables, and mechanics, and the author plans 
to extend the series with additional tests now in preparation. The 
stanines, with mean 5 and standard deviation 2, are based on the 
scores of a representative sample of 1,563 Pittsburgh high school seniors, 
and a supplementary table gives the percentiles for boys and girls cor- 
responding to the stanines. An aptitude classification sheet replaces the 
usual profile chart. Subgroups of stanine scores are added, and a table 
on the aptitude classification sheet gives occupational stanines for 30 
occupations and college aptitude. The selection of tests for each occupa- 
tion was made on the basis of job analysis data and the validities of 
similar tests as reported in the literature. The occupational stanines 
were then computed from the distributions of the sums of stanines, for 
the selected groups of tests, using the data of the norms sample. They
represent equal weighting of the tests in the subgroup, and do not 
include anything in the nature of cut-off scores. Intercorrelations among 
the tests are presented for the data of the norms sample, and reliability 
data based on smaller samples are reported for the tests and for nine 
representative occupational batteries. Validity coefficients are reported 
for seven occupations and three college curricula, based on merit pro- 
motions over a four-year period and grade-point averages respectively. 
More recent materials, not yet in the manuals, give the mean stanine 
scores on the tests for subgroups of the standardization group who said 
they were satisfied and successful in 23 occupations and 19 types of
post-high-school specialized training courses. Percentile norms are also 
reported for the applicable occupational stanines for nine occupations 
and nine training groups, and on the basis of these studies minimum
occupational stanines are in preparation. 
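
The conversion of a sum of stanines to an occupational stanine can be sketched in Python as follows. The sketch rests on assumptions: it uses the customary stanine bands of 4, 7, 12, 17, 20, 17, 12, 7, and 4 percent and a mid-percentile treatment of ties, which may differ in detail from the publisher's procedure, and the norms-sample sums shown are hypothetical.

```python
from bisect import bisect_left, bisect_right
from typing import Sequence

# Cumulative percentage at the upper edge of stanines 1 through 8,
# from bands of 4, 7, 12, 17, 20, 17, 12, 7 and 4 percent.
UPPER_EDGES = [4, 11, 23, 40, 60, 77, 89, 96]


def stanine(value: float, norms: Sequence[float]) -> int:
    """Stanine of `value` relative to the distribution of `norms`."""
    ordered = sorted(norms)
    below = bisect_left(ordered, value)
    ties = bisect_right(ordered, value) - below
    # Mid-percentile rank: half of the tied cases are counted as below.
    percentile = 100.0 * (below + 0.5 * ties) / len(ordered)
    return 1 + bisect_right(UPPER_EDGES, percentile)


def occupational_stanine(test_stanines: Sequence[int],
                         norm_sums: Sequence[int]) -> int:
    """Equal weighting: sum the selected tests' stanines, then re-scale."""
    return stanine(sum(test_stanines), norm_sums)


# Hypothetical sums of three test stanines in a small norms sample.
norm_sums = [9, 11, 12, 13, 13, 14, 14, 15, 15, 15,
             15, 16, 16, 17, 17, 18, 19, 20, 21, 23]

print(occupational_stanine([6, 7, 5], norm_sums))
```

Because the tests are simply added, each contributes equally to the occupational stanine, and no minimum is imposed on any single test.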

I shall not attempt to review the scattered representation of tests
from the third category which appear in Table 1. The most important
of these is probably the Yale Educational Aptitude Test, the data for 
which appear in Crawford and Burnham's Forecasting College Achievement.

Aside from the data just cited, and those in Table 1, how do the
various batteries compare with one another? There is certainly no single
answer to this question. All of these batteries are modern in the best
sense of the term, and well constructed. Dollar costs aside, the user will
get from any one of them just about as much effective measurement as
he pays for in testing time. He can safely select whichever battery
measures most nearly what he wants to measure in the time available
to his subjects.

A few general observations, however, may be in order. It seems to
me that there is little value in striving for almost-pure factor scores.
If this results in the same test appearing in two or more factor composites,
and the factor scores are then used to predict external criteria,
this test will receive undue weight in the larger predictor composite if
the latter is based on equal weighting of the factor scores. If the weights
for the factor scores are regression weights, this test will increase spuriously
the correlation between the two factor scores to which it contributes,
thus lowering their regression weights. The net result may be
that the test itself is properly weighted, but the other tests in the two
or more factor composites will be underweighted. This criticism applies
to the General Aptitude Test Battery and the Multiple Aptitude Tests.
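
The point about undue weight can be made concrete with a little arithmetic, sketched here in Python. The composition of the four factor scores is that given for the Multiple Aptitude Tests in Table 1; the assumptions are that each factor score is the simple average of its tests' standard scores and that the four factor scores are then weighted equally. The function name is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from typing import Dict, List

# Factor composites of the Multiple Aptitude Tests, by test number (Table 1).
FACTORS: Dict[str, List[int]] = {
    "Verbal comprehension": [1, 2, 3],
    "Perceptual speed": [3, 4],
    "Numerical reasoning": [5, 6],
    "Spatial visualization": [7, 8, 9],
}


def effective_test_weights(factors: Dict[str, List[int]]) -> Dict[int, Fraction]:
    """Weight each test receives when averaged factor scores are weighted equally."""
    weights: Dict[int, Fraction] = defaultdict(Fraction)
    for tests in factors.values():
        share = Fraction(1, len(tests))  # a test's share of its factor average
        for test in tests:
            weights[test] += share
    return dict(weights)


weights = effective_test_weights(FACTORS)
for test in sorted(weights):
    print(test, weights[test])
```

Test 3, which enters two composites, receives a weight of 5/6, while tests 1 and 2 receive 1/3 each; the overlap alone gives it two and one-half times their weight.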

If almost-pure factor scores are derived from different combinations
of tests, the several tests measuring each factor will correlate highly
with one another. The proper objective of any multiple aptitude battery
designed for service use is to include tests all of which have low intercorrelations
but every one of which will be a valid predictor of at least
one category of important criterion variables. Batteries yielding almost-pure
factor scores reduce thereby the "factorial range" that might
otherwise be obtained in the same testing time. The ACE Test for
Primary Mental Abilities, Chicago Tests of Primary Mental Abilities,
Holzinger-Crowder Uni-Factor Tests, and Multiple Aptitude Tests are
subject to this criticism to perhaps a greater degree than the others.

The value of descriptive norms is a function of their representativeness
for some defined group rather than merely of the number of cases
on which they are based. Here all other authors would do well to study
the methods by which the United States Employment Service arrived
at norms representative of the general working population. They may
be as lucky as was the USES with its preliminary norms based on the
first 500 cases which came to hand, but again they may not! Grade
norms, age norms, and percentile norms for groups of specified age and
sex or specified grade and sex should mean what they say, and samples
carefully stratified on all other major determiners of variability are 
necessary to attain this goal. For some purposes one can question the 
need for descriptive norms. For industrial tests the basic norms can be
arbitrary. If the object is merely to derive a system which will equate 
the scores on the tests, a pick-up sample such as Flanagan's is entirely 
adequate. For such tests the real need is for representative samples of 
workers in the various occupations and occupational groups. No such 
samples have ever been found outside the armed services, so far as I 
am aware, and the task is certainly Herculean. 

The reported reliability coefficients are of little value in comparing 
one battery with another. Most of them are of the split-half variety, 
and a few more have been computed by one of the Kuder-Richardson 
formulas. All such coefficients are spurious to greater or less degree with 
timed tests, the amount of spuriousness depending on the severity of
the time limits in each case. A few authors have even reported such
coefficients for pure speed tests. The numerical value of a test-retest or
alternate-form reliability is in part a function of the time interval separating
the two test sessions. Coefficients of this type are reported for intervals
ranging from consecutive administration at one test session to
three or four years. Every reliability coefficient varies with the range of
talent of the examinees. The ranges reported cover grade groups, age
groups, and occupation groups.
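
Why a split-half coefficient is spurious for a pure speed test can be seen from a small Python sketch. The data are hypothetical: eight examinees on a test so easy that every item attempted is answered correctly, so that a score is simply the number of items reached. The odd-even split and the Spearman-Brown step-up are the usual ones; the function names are illustrative.

```python
from math import sqrt
from typing import List


def pearson(x: List[float], y: List[float]) -> float:
    """Product-moment correlation of two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half: float) -> float:
    """Step a half-test correlation up to full test length."""
    return 2.0 * r_half / (1.0 + r_half)


# Hypothetical numbers of items reached (all answered correctly).
attempted = [20, 25, 31, 18, 40, 27, 33, 22]

# On a pure speed test the odd and even halves are forced to agree:
# they can differ by one item at most.
odd_half = [(n + 1) // 2 for n in attempted]
even_half = [n // 2 for n in attempted]

r_half = pearson(odd_half, even_half)
print(r_half, spearman_brown(r_half))
```

The coefficient comes out close to unity whatever the true consistency of the examinees' speed from one occasion to another, which is why such figures tell us nothing about a speeded test.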

Finally, we have with us still in occupational testing, the argument
of weighted composites versus multiple cut-off scores, with, among others,
the USES on one side and Flanagan on the other. Far be it from me to
try to settle this argument!

Table 1   Probable Factorial Compositions

FACTORS                                          OTHER ABILITIES

V   verbal knowledge                             SP  spelling
F   fluency (verbal, ideational, expression)     L   language usage
S   space (incl. orientation and visualization)  VD  visual discrimination
D   deduction (syllogisms)                       J   judgment
R   reasoning (incl. induction)                  MA  mathematical achievement
M   mechanical knowledge                         SI  science information
IM  immediate memory
N   number facility
A   aiming
MS  motor speed                                  APPARATUS FACTORS
P   perceptual speed (incl. clerical speed,      (not here considered)
    form perception, and symbol discrimination)  finger dexterity (GATB)
CS  closure speed (figure unknown)               manual dexterity (GATB)
CF  closure flexibility (figure given)           motor (Factored Apt. Series)

EXPLANATION OF "REMARKS": 1) Easy arithmetic computation is mostly N; hard is mostly R.
2) Cross-out: list of elements; one which does not "belong" to be identified. 3) Identical forms:
figure given, same figure to be identified among others differing only slightly, but still easily
discriminated. 4) Surface development: pattern and object (as of sheet-metal).






I. ACE Test for Primary Mental Abilities (SRA)



Test 



No. of 



Items 


Minutes Factors 


36 


> V,R 


60(?)2 


S 


870 


P,V 


30 


R,V 


120 


N,R 


20 


R 


100 


V 


150 


N 


30 


R 


40(?)2 


S 


30 


R 


25 


IM 


60 


P 


20 


R 


44 


M.S»R 


20 


IM 


1653(?) 



Remarks 



1. Completion 

2. Figures 

3. Verbal enumeration 

4. Letter grouping 

5. Addition 

6. Arithmetic 

.3 

7. Same or opposite

8. Multiplication 

9. Number series 

10. Cards 

11. Number patterns 

12. Initials 

■ J 

13. Identical forms 

14. Marks 

15. Mechanical movements 

16. Word-number 

(Three booklets)
Factor scores: data not available to writer 

II. Chicago Tests of Primary Mental Abilities, Ages 11-17 (SRA)



Word from definition 
Slide vs. turn-over 
Pick words of given class from list 
Cross-out 
Fairlv easy 
Problems 



Easy 

Slide vs. turn-over 
Incomplete matrices 
Initials-surname 



Position-series completion 



1. Addition 

2. Multiplication 

3. Vocabulary 

4. Completion 

5. Figures 

6. Cards 

7. First letters 

8. Four-letter words

9. Letter series

10. Letter grouping

11. First names

(One booklet) 



r ACTOR scores: i>uroDer (J-h^A 
Reasoning (9-hlO). Memory (II). 



70 


6 


N,R 


70 


3 


N,R 


50 


4 


V 


45 


6 


V,F 


20(54)2 


5 


S 


20(54) J 


5 




80 


5 


F 


60 


4 


F 


30 


6 


R 


30 


4 


R 


20 


8 


IM 


495(563)2 


58 





Fairly easy 
Fairly easy 

Definition given; supply word 

i Slide vs. turn-over 
Write words or four-letter words 
with given first letter 
Series completion 
Cross-out 

Write first, given last 



III. SRA Primary Mental Abilities, Ages 11-17.

                      No. of      Working   Probable
Test                  Items       Minutes   Factors   Remarks

1. Verbal-meaning     50           4        V         Vocabulary
2. Space              20(54)²      5        S         Slide vs. turn-over
3. Reasoning          30           6        R         Letter series
4. Number             70           6        N,R       Addition, fairly easy
5. Word-fluency       70           5        F         First letter given, write words

(One booklet)         240(274)²   26

Factor scores: same as test scores. Total score: V+S+2R+2N+F.



¹ These data from test booklet only; writer did not have manual.

² Maximum score, where different from number of items as in multiple-answer items, in parentheses.
³ Horizontal line indicates end of multi-test booklet.




IV. SRA Primary Mental Abilities, Ages

Test 



1. Words

2. Pictures

3. Space

4. Word-grouping

5. Figure-grouping

6. Perception

7. Number

[One booklet]



No. of 
lU-ms 


Working 
Minutes 


Probable 
Factors 


36 


8 




37 


8 


V 


27 




S 


27 


6 


R.V 


27 


8 


R 


50 


5 


P 


52 


S 


iN.H 


256 


47 





Remarks 



Vocabulary 
Oral vocabulary 
Paper form board 
Cross-out 
Cross-out 
Identical forms 
Fairly easy 



Factor scores: Verbal (1+2). Space (3). Reasoning (4+5). Perceptual speed (6). Number (7).
V. SRA Primary Mental Abilities, Ages 5-7.



1. Verbal-meaning

2. Perceptual-speed

3. Quantitative

4. Motor



[One booklet]
Factor scores: Same as test scores.



49 


• 


V 


30 


1.5 


P 


27 


• 


N.R 


80 


I 


A,MS 


24 


• 


S.R 


210 


•Not timed 





Oral vocabulary 
Identical forms 
Counting and arithmetic

Paper form board



VI. Holzinger-Crowder Uni-Factor Tests (World Book)

1. Word meaning

2. Odd words

3. Boots

4. Hatchets

5. Mixed arithmetic

6. Remainders

7. Mixed series

8. Figure changes

9. Teams



45 
45 
70 
70 
hO 
60 
40 
40 
30 



4.5 
2.5 
2.5 

3 

3 



\ 
\ 

S 
S 
\ 
N 
R 

R.D 
D.V 



40.; 



[One b«j<iklet) 400 
Factor scores: Verbal (1+2). Spatial (3+4). Numerical
Scholastic aptitude: 5(1+2)+(5+6)+3(7+8+9).

VII. Factored Aptitude Series (Indust. Psy.).

1. Office terms

2. Sales terms

3. Scientific terms*

4. Mechanical terms*

5. Tools

6. Judgment

7. Differences*
8. Numbers
9. Perception

10. Precision

11. Fluency

12. Memory

13. Dimension

14. Parts

15. Blocks

16. Dexterity

[Separate booklets]



Vocabulary 
Synonyms 

Slide vs. turn-over

Easy

After easy division
Letter and number series
Figure analogies
Syllogisms 



(5+6). Reasoning (7+8+9); also



54 


.5 


\ 


54 


5 


\ 


48 


5 


M 


54 


5 


R 


54 


5 


R.N 


54 


5 


P 


48 


5 


P 


18'i 


6 


F 


36 


2+3 


IM 


48 


5 


S 


48 


5 


S.R 


32 




.S.N 


90-1-120 + 180 


1+1 + 1 


A.MS 



Technical vocabulary 
Technical vocabulary 



What goes with what
Mixed series completion and
cross-out

Easy to fairly hard

Name and number checking

Identical forms

Prefixes, suffixes, jobs, office

equipment (write words)
Names and pictures
Pick left-right reversed pictures
Paper form board
Block counting (AGCT)
Trace, check, dot



* These tests not available to writer for examination. 




VIII. Guilford-Zimmerman Aptitude Survey (Sheridan).

                            No. of   Working   Probable
Test                        Items    Minutes   Factors   Remarks

1. Verbal comprehension      72       25       V         Vocabulary
2. General reasoning         27       35       R         Arith. reasoning
3. Numerical operations     132        8       N,P       Easy
4. Perceptual speed          72        5       P         Identical forms
5. Spatial orientation       58       10       S,R       Boat-heading changes
6. Spatial visualization     60       30       S,R       3-dimensional rotations
7. Mechanical knowledge      55       30       M,V       20 picture; 35 verbal

[Seven booklets]            476      143



IX. General Aptitude Test Battery (USES).

1. Name comparison

2. Computation

3. Three-dimensional space

4. Vocabulary

------------ ³

5. Tool matching

6. Arithmetic reason

7. Form matching



1.50 


6 


P 


50 


6 


S 


40 


6 


60 


6 


V 


V> 


5 


p 


25 


7 


H 


60 


6 


P 


200 


I 


A.M.S 


63-1 


43 





Fairly easy 

Surface development



Identical forms

2 sets of scattered figures



8. Mark making

[Three booklets]

Aptitude scores: Intelligence (3+4+6). Verbal (4). Numerical (2+6). Spatial (3). Form
perception (5+7). Clerical perception (1). Motor coordination (8).



X. Differential Aptitude Tests (Psy. Corp.).

1. Verbal reasoning

2. Numerical ability

3. Abstract reasoning

4. Space relations

5. Mechanical reasoning

6. Clerical speed & accuracy

7. Language usage

I. Spelling
II. Sentences

[Seven booklets]



50 
40 

50 

40(100)2 
6B 
100 

100 

50(95)2 



30 
30 
25 
30 
30 
6 

10 



V,R,D Verbal analogies

R,N Arith. comp., hard

R Figure progression

S Surface development

M Mechanical comprehension

P Identical symbols

SP,V Single word R-W

L,V Locate errors



498(603)2 



² Maximum score, where different from number of items as in multiple-answer items, in ( ).
³ Horizontal line indicates end of multi-test booklet.






XI. Multiple Aptitude Tests (Cal. Test Bur.).

No. of Working
Test Items Minutes



Probable 
Factors 



Remarks 



1. Word meaning 

2. Paragraph meaning 

3. Language usage 

4. Routine clerical facility 

5. Arithmetic reasoning 

6. Arithmetic computation 

7. Applied science and 

mechanics 

8. Spatial relations — two 

dimensions 

9. Spatial relations — three

dimensions

[Nine booklets]



60 

SO 

60(120)2 
90(180)2 



.15 
60 



M0(r>90)2 



12 

30 

25 

6.5 

or 85 

30 

22 

30 

8 

12 



Vocabulary



V

V,R

L,SP,V Error location
V



R

R,N

M,S,R

S

S,R



Name and number checking
Fairly hard 

Mech. comp. and mech. movements 
Paper form board 
Surface development 



Factor scores: Verbal comprehension (1+2+3), Perceptual speed (3+4), Numerical reasoning
(5+6), Spatial visualization (7+8+9).

XII. Flanagan Aptitude Classification Tests (SRA).



1. Inspection 

2. Coding 

3. Memory 

4. Precision 

5. Assembly 

6. Scales 

7. Coordination 

8. Judgment and comprehension

9. Arithmetic 

10. Patterns 

11. Components 

12. Tables 

13. Mechanics 

14. Expression 



40(155)2 


6 


150 


10 


25 


4 


252 


8 


20 


12 


120 


12 


100 


2^40' 


24 


35 + 


125 


10 


30(60)2 


20 


40 


20 


120 


10 


20 


20 


.52(6-1)2 


35 + 


118(1275)2 


204 + 



P,VD Identical forms

P,R,IM Difficult

IM Memory for code

A Narrow path tracing

R,S 3-dimen. paper form board

P,R,VD Curve reading

A Wide path tracing

V,R Paragraph reading

N,R Fairly easy

CF,S,A Copying designs

CF Like Gottschaldt

P,S Table reading

M,R,S Complex mech. movements

L,V Grammar and usage



² Maximum score, where different from number of items as in multiple-answer items, in parentheses.
^ 8 if separate answer sheet is used.






XIII. Yale Educational Aptitude Test Battery, Form B(ERB). 

No. of Working Probable
Test Items Minutes Factors

1. Paragraph reading

2. Word relations

3. Synonyms

4. Translation (Art. Lang.)

5. Translation (Art. Lang.)

6. Memory (Art. Lang.)

7. Equations

8. Equations



Remarks



9. Figures

10. Cubes

11. Projections

12. Composite figures

13. Word relations

14. Logical inference

15. Interp. of expts.

16. Number series

17. Symbolic relationships

18. Discovering principles

19. Mechan. movements

20. Mechan. movements



40 
65 
100 
8.1 
96 
45 
70 
62 

41 
120 

20 
48 

40 
39 
40 
30 
30 
40 
61 
38 



15-20 
15-20 
15-20 
15-20 
15-20 
17-20 
15-20 
15-20 

15-20 

15- 20 

10- 12 
20-25 

12- 17 

13- 18 

16- 21 

11- 16 
9-14 

31-36 
22-27 
20-25 



One wrong word 

Opposite of different part of speech
Vocabulary



V,H,D,L 
V 

R,V,L 
R,V.L 

IM,HAM' Translate without key 



R,MA 
R,MA 

R,MA 
S,R 
H,S 
S,R 

V,D.ll 

D,n,v 

R,D,V 
R 

D.R 

R.D,V 

S,R,M 

n,M,.s 



Algebra computations 
Problem: formula; functional 

change 
Geometry 



Verbal analogies 
Enthvmemes 



Symbolic syllogisms
Functional relations tabulated 



[Two booklets] 1109 316-411•

Area (Test) scores: Verbal comprehension (1+2+3), Artificial Language (4+5+6), Mathematical
aptitude (7+8+9), Spatial Relations (10+11+12), Verbal Reasoning (13+14+15),
Quantitative Reasoning (16+17+18), Mechanical Ingenuity (19+20).

XIV. Aptitude Tests for Occupations (Cal. Test Bur.).

                               No. of   Working   Probable
Test                           Items    Minutes   Factors   Remarks

1. Personal-social aptitude     45       20       V,J,R     Paragraphs
2. Mechanical aptitude          60       20       M,S,R     Mixed mech. and space items
3. Gen. sales aptitude          45       20       V,J,M     Paragraphs
4. Clerical routine aptitude    60       12       P,V,SP    Mixed checking, alphabet., spelling
5. Computational aptitude       45       15       R,N       Arith. comp. and estimation
6. Scientific aptitude          45       20       R,V,D,S   Mixed problems

[Six booklets]                 300      107



XV. Engineering and Physical Science Aptitude Test (Psy. Corp.).

1. Mathematics 23 15 R,MA Algebra computations

2. Formulation 10 10 R,MA Algebra problems: set up formula

Science information



2. Formulation 

3. Phys. Sci. comprehension
Arithmetic reasoning
Verbal comprehension
Mechanical comprehen.

[One booklet] 



10 
15 
10 

x:\ 

22 
155 



10 
10 
15 
10 
12 



h,MA 
V.SI 

n 

V 
M 



Vocabulary 



XVI. Cleeton-Mason Vocational Aptitude Examination (McKnight)

1. General information

2. Arithmetical reasoning

3. Judgment in estimating

4. Symbolic relationships

5. Reading comprehension

6. Vocabulary

7. Interest

8. Typical reactions



[One booklet]



50

30

30
20
25
45
98
80

378
> V 

n 

J,V.M 
R,D 



Est. no. of men in Navy in 1920, e.g.
Figure analogies
V,R Paragraph reading

V Word-definition matching

Interest and personality factors
not here considered



¹ These data from test booklet only; writer did not have manual.

³ Horizontal line indicates end of multi-test booklet.

• Depending on whether or not practice test was given first.






The Logic of and Assumptions 
Underlying Differential Testing 

JOHN W. FRENCH 



Let me start my discussion of differential testing by taking a typical 
practical problem in which differential testing applies. Suppose a student 
has the choice of entering fields A, B, or C, where A, B, and C are either
academic courses or occupations. Let us assume that we have given 
suitable batteries of tests to previous groups of students and have fol- 
lowed up those students to obtain a quantitative measure of how suc- 
cessful or how happy the students became in pursuing A, B, and C. 
For this criterion measure of success or satisfaction even a dichotomy 
would be satisfactory. 

Now we are asked by the student which field we would recommend 
for him: A, B, or C. Our choice of the statistical techniques to apply 
should depend on what the student wants to know. He probably doesn't 
know exactly what he wants to know. However, I think we can assume 
that he would like to enter the field in which he would be most happy 
and/or most successful. This means he needs information such as (1) 
his chance of obtaining a certain level of success or satisfaction in each 
field, and (2) his chance of obtaining greater success or satisfaction in 
one field as compared to that in any other field. 

Let me compare two statistical techniques that are recommended for 
developing test batteries useful in guidance work: multiple discriminant
analysis and multiple regression. 

Those who recommend multiple discriminant analysis in this kind of 
guidance work attempt to answer the student's problem by showing 
him how much resemblance there is between his own test scores and the 
average test scores for people in fields A, B, or C. It is suggested to the 
student that he enter the field in which his colleagues would have test 
scores most closely resembling his own. If the criterion groups for fields 
A, B, and C were chosen from among successful people in their respective 
fields, it is expected that the student will also be successful when
associated with the group that he most closely resembles. How successful?
What chance does he have of not being successful? Is he likely to be
more successful in one field than in another? Multiple discriminant
analysis doesn't answer these questions. It is an excellent technique for
detecting membership in a group, for handling the very elusive problems
of classification based on qualitative differences. But it does not answer
the question: "How well will I do if I take a job as a dog catcher?"
Although discriminant analysis cannot answer this kind of question,
it does have a place in guidance work. It is probably the best available
method in cases where criterion scores are unavailable or so restricted
in range that multiple regression would give only a distorted picture.
I will discuss this limitation of multiple regression later.

Validity coefficients rather than score patterns are the stock-in-trade
for those who have satisfactory criterion scores available to them and
who want to give what seems to me to be the direct answer to the
student's problem. This is the multiple regression method. It provides
predictions which indicate to the student his chances for attaining a
given amount of success in A, B, and C, and differential predictions
which indicate his chances for being more successful in one field than
in another.

Let us look at the data in an actual case so that we can compare a
counselor's advice based on multiple discriminant analysis with a counselor's
advice based on multiple regression. Tables 1 and 2 on the handout
present small portions from each of two larger tables. The rows in
the two tables represent four aptitude scores: Perceptual Speed (this
is mainly speed in finding given symbols in a mass of distracting material),
Mechanical Knowledge (this is a knowledge of mechanical
techniques and equipment), Carelessness (this is the number of errors
made on speeded tests; a high score indicates many careless errors),
and Speed of Judgment (this is the number of simple choices made
within a short time limit; no attention is paid to the correctness of the
subject's judgments or to the nature of his preferences). The columns
in the tables represent groups of vocational high school students who
later became respectively office workers, beauty operators, carpenters,
and mechanics. The first two groups are girls, the second two are boys.
Table 1 gives the validity coefficients for vocational shop course grades.
Blanks occur in the table where the coefficients were non-significant.
Table 2 gives the mean test scores for the four groups of students. For
convenience of interpretation the means have been converted so that
50 is the general mean of all groups and 10 is the standard deviation.

For the office worker group, Perceptual Speed and Speed of Judgment
look good from the standpoint of the validity coefficients. Therefore,
multiple regression would choose office workers who had high scores on
these two aptitudes. Future office workers also have the highest mean
scores on these two factors. Therefore, multiple discriminant analysis
would guide into office jobs girls who had high scores on Perceptual
Speed and Speed of Judgment. Thus, here is a case where both multiple
regression and multiple discriminant analysis would select the same
people for the job.

For mechanics the validity coefficients recommend high mechanical
knowledge, carefulness (that is, there is a negative validity for number
of careless errors), and slowness of judgment (there is a negative validity
for number of choices made). The means, on the other hand, show that
the criterion group of mechanics had high mechanical knowledge, but
they were the most careless of the four groups and were speedier of
judgment than the carpenters. This is a situation where multiple regression
would guide different boys into mechanics than would multiple
discriminant analysis.
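
The disagreement between the two methods in the case of the mechanics can be sketched in Python. Every figure below is hypothetical; only the signs of the validities and the direction of the group means follow the description above. Two simplifications are assumed and should be noted: the validity coefficients stand in for regression weights, and a plain distance from the group means stands in for a discriminant function.

```python
from math import sqrt
from typing import Dict

TESTS = ("mechanical_knowledge", "carelessness", "speed_of_judgment")

# Hypothetical validities for mechanics (signs as described in the text).
VALIDITIES = {"mechanical_knowledge": 0.40,
              "carelessness": -0.30,
              "speed_of_judgment": -0.20}

# Hypothetical mean scores of the mechanics group (general mean 50, S.D. 10).
MECHANIC_MEANS = {"mechanical_knowledge": 58.0,
                  "carelessness": 56.0,
                  "speed_of_judgment": 54.0}


def predicted_success(scores: Dict[str, float]) -> float:
    """Regression-style composite: validity-weighted standard scores."""
    return sum(VALIDITIES[t] * (scores[t] - 50.0) / 10.0 for t in TESTS)


def distance_from_group(scores: Dict[str, float]) -> float:
    """Resemblance to present mechanics: smaller means more alike."""
    return sqrt(sum((scores[t] - MECHANIC_MEANS[t]) ** 2 for t in TESTS))


# Two hypothetical boys.
careful_boy = {"mechanical_knowledge": 60.0, "carelessness": 42.0,
               "speed_of_judgment": 44.0}
careless_boy = {"mechanical_knowledge": 58.0, "carelessness": 57.0,
                "speed_of_judgment": 55.0}

for name, boy in (("careful", careful_boy), ("careless", careless_boy)):
    print(name, round(predicted_success(boy), 2),
          round(distance_from_group(boy), 2))
```

The careful boy has much the higher predicted success, yet the careless boy resembles present mechanics far more closely; the two rules of advice point to different boys.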

For beauticians and carpenters the two methods would also select
somewhat different kinds of people.

Which method is the more suitable? Let me reply by asking a leading
question. Do we want to encourage speedy, careless boys to go into
mechanics just because mechanics are speedy and careless now, even
though speed and carelessness correlate negatively with performance
ratings?

I have tried to point out how two theoretical models for differential
testing are related to the practical problem of counseling. The multiple
regression techniques when made possible by the nature of the data
seem to be more suitable, at least in view of the kind of discussion I have
been advancing. Let me now turn to a discussion of some of the theory
bearing upon the accuracy and the limitations of predicting amount of
success by the multiple regression techniques.

There are two ways for measuring the effectiveness of differential
testing that make pretty good sense to me. By inspecting the equations
involved it is possible to understand what things need to be maximized
or minimized to attain the most accurate discriminations.

Paul Horst has developed a number of general formulas in this area.
William Mollenkopf* has worked out a formula for the validity of a
battery in predicting a difference between two criteria, a and b. This
formula is Formula 1 on the handout. $R_{d^*d}$ is the validity of the
differential prediction, that is, the correlation between $d^*$, the
predicted difference, and $d$, the observed difference. Stars in this
notation mean predicted. $R_{a^*a}$ is the validity of the battery for
criterion a, and $R_{b^*b}$ is the validity for criterion b. $r_{a^*b^*}$
is the correlation between the predicted criterion scores, and $r_{ab}$
is the correlation between the observed criterion measures.

*Mollenkopf, W. G. Predicting differences and differences between
predictions. Psychometrika, 1950, 15, 409-417.
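
The formula itself did not survive on the handout as reproduced here. The following is a reconstruction from the definitions in the text, not a quotation; it assumes both criteria are expressed in standard scores and that the predictions are least-squares regression estimates from the same battery.

```latex
\[
R_{d^*d} \;=\;
\sqrt{\frac{R_{a^*a}^{2} + R_{b^*b}^{2}
            - 2\,R_{a^*a}\,R_{b^*b}\,r_{a^*b^*}}
           {2 - 2\,r_{ab}}}
\]
```

Under those assumptions the numerator is the variance of the predicted difference and the denominator is the variance of the observed difference. When the two validities are equal and $r_{a^*b^*} = 1$, the numerator vanishes and the validity of the differential prediction is zero.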






It is clear from the equation that the validities for the two criteria
should be high. $r_{ab}$, the correlation between actual criteria, depends
upon what particular criteria are involved and so is not in the
experimenter's control. The critical point for Mollenkopf's equation is
that the correlation between predictions should be as low as possible.
Let me translate this demand of the equation into terms of direct
interest to the constructor of the test battery. Let us suppose that each
test in the battery had the same validity for criterion a as it had for
criterion b. For example, suppose we are trying to discriminate between
plumbing and carpentry. Perhaps a mechanical test has a high validity for
both. Let's say a verbal test has a low validity for both. Then the same
tests and same weights would be used to predict success in both plumbing
and carpentry. The predictions for any one person would be exactly
the same. $r_{a^*b^*}$ would be 1.00, and, according to Formula 1, the
validity of differential prediction would be zero. On the other hand, if
each test has a very different validity for plumbing from what it has for
carpentry, the predictions for the two criteria will be made on the basis
of different tests or very differently weighted tests. The correlation
between predictions, $r_{a^*b^*}$, will be a minimum. That is, it is a
critical requirement for each test to have different validities for the
different criteria. This differential validity is more likely to occur if
the tests in the battery are highly independent one from another. Use of
pure-factor tests or factor scores is one way to heighten chances of
reaching this goal. The validity coefficients in Table 1 on the handout
indicate that here is an instance where some success was attained in
finding for each test widely different validities for the different
criteria.
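
A small numerical sketch may make the plumbing and carpentry argument concrete. It assumes the usual form of the differential-validity formula for standardized criteria, written out in the docstring; the function name and all input values are hypothetical and chosen only for illustration.

```python
import math


def differential_validity(R_a, R_b, r_pred, r_crit):
    """Validity of a predicted difference between two criteria.

    Assumed form (standardized criteria, least-squares predictions):
        sqrt((R_a^2 + R_b^2 - 2*R_a*R_b*r_pred) / (2 - 2*r_crit))
    R_a, R_b : battery validities for criteria a and b
    r_pred   : correlation between the two sets of predictions
    r_crit   : correlation between the observed criteria (must be < 1)
    """
    numerator = R_a ** 2 + R_b ** 2 - 2 * R_a * R_b * r_pred
    denominator = 2 - 2 * r_crit
    # Guard against tiny negative values caused by rounding.
    return math.sqrt(max(numerator, 0.0) / denominator)


# Same tests, same weights: predictions correlate 1.00, nothing is gained.
same_weights = differential_validity(0.5, 0.5, 1.0, 0.3)

# Differently weighted tests: predictions correlate only 0.2.
different_weights = differential_validity(0.5, 0.5, 0.2, 0.3)

print(same_weights, round(different_weights, 3))
```

With the battery validities held fixed, lowering the correlation between predictions is the only thing that raises the result, which is the point made in the paragraph above.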

It is perhaps wise to remind ourselves here not to lose sight of the
fact that good general prediction is also useful in counseling. That is,
the student not only wants to know in which job he will do best, but
he also wants to know how well he is likely to do. One should, therefore,
consider the inclusion of some highly valid tests of mixed factorial
content. There is a real danger of losing high general prediction when
one is trying too hard to get good differential prediction.

Another way of judging the effectiveness of differential prediction
that makes good sense to me was first described by T. L. Kelley and
later developed by Segel* and by Bennett and Doppelt†. Suppose two
persons stand at exactly the same level on some aptitude. When these
two people are tested for this aptitude by fallible tests, there will be
a difference between the scores they receive. If the testing is done
repeatedly, a distribution of differences will evolve. This distribution
of differences may be said to be entirely attributable to chance, since
there is actually no difference between the aptitude levels of the two
people. In the case where a real difference in aptitude level does exist,
the observed differences in scores will be greater: they will be partly
attributable to chance and partly a reflection of the real difference in
aptitude level. The effectiveness of differential testing can be stated in
terms of the proportion of observed differences that are not attributable
to chance. If the two variables in question are highly related, the real
differences will be small. Therefore, the proportion that is not accounted
for by chance will be low. If the two variables are relatively independent,
the real differences will be large. If the tests are highly reliable, the
chance differences will be small, and the proportion not accounted for by
chance will be high.

*Segel, D. Differential Diagnosis of Ability in School Children.
Warwick and York, 1934.

†Bennett, G. K., and Doppelt, J. E. The evaluation of pairs of tests for
guidance use.

For computing the proportion not accounted for by chance, Bennett
and Doppelt presented an easy-to-use nomograph. Kelley presented a
table yielding the desired proportion when entered by Formula 2 on
the handout. In this value the numerator gives the standard error of
differences caused by the unreliability of the tests, and the denominator
gives the over-all standard error of differences found between test scores.
In the equation $r_{1I}$ and $r_{2II}$ are the reliabilities of the tests,
and $r_{12}$ is the correlation between the test scores.
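
The value described above can be sketched in a few lines. The expression used here is inferred from the verbal description (standard scores assumed): the chance standard error of a difference over the over-all standard error of a difference. The function name and the reliabilities and correlations in the example are hypothetical. Converting the value into a proportion requires Kelley's table or the Bennett and Doppelt nomograph, neither of which is reproduced here.

```python
import math


def chance_difference_ratio(rel_1, rel_2, r_12):
    """Ratio of chance to over-all standard error of a score difference.

    Assumed form, for standard scores:
        sqrt(2 - rel_1 - rel_2) / sqrt(2 - 2 * r_12)
    rel_1, rel_2 : reliabilities of the two tests
    r_12         : correlation between the two tests (must be < 1)
    """
    chance_se = math.sqrt(2 - rel_1 - rel_2)
    overall_se = math.sqrt(2 - 2 * r_12)
    return chance_se / overall_se


# Two reliable tests that are fairly independent of each other.
independent = chance_difference_ratio(0.90, 0.90, 0.50)

# The same reliabilities, but the tests overlap heavily.
overlapping = chance_difference_ratio(0.90, 0.90, 0.80)

print(round(independent, 3), round(overlapping, 3))
```

A smaller ratio means that less of the observed difference is chance. Raising the correlation between the tests raises the ratio, and raising the reliabilities lowers it, as the text states.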

While this formula was worked out for pairs of individual tests, there
is no reason why it cannot be applied to pairs of test batteries. When
we are interested in prediction, the batteries used for two criteria will
usually overlap, because one or more of the tests are likely to be valid
for both criteria. The correlations between the predictions for the two
criteria are likely, therefore, to be high. The correlation between
predictions is analogous to the $r_{12}$ in the formula. The formula shows
that it is critical to keep this correlation down. This can only be done by
having relatively independent tests weighted as differently as possible
in the prediction equations. This means that here again each test must
have widely different validities for different criteria.

There is one very disturbing matter that seems fitting to discuss in
conjunction with the foregoing remarks about highly differential validities
and about the choice between multiple regression and multiple
discriminant analysis. It is something that tends to befuddle the multiple
regression approach to differential prediction.




Let's say we are trying to predict success as a mechanic. In view of the
correlations appearing on the handout the regression equation for this
prediction will include a considerable weighting of Mechanical Knowledge
and a smaller negative weighting of Carelessness and Speed of Judgment
(or positive weighting of Carefulness and Slowness of Judgment). Now
let's suppose that a hypothetical factor X was also, for some obvious
psychological reason, absolutely essential for mechanics, so essential
that all mechanics need it in a high degree. This factor X might be some
such thing as a willingness to get all messed up with dirty grease. The
range of scores on factor X would be at a high level and very restricted
in extent. This would make the observed validity coefficient for factor X
low, perhaps so low that factor X would not enter into the prediction
equation for mechanics at all. Suppose we used only the factors with
high validities to make our predictions. Then we might predict that a
certain student would do well as a mechanic, because he is high on
Mechanical Knowledge and low on Carelessness and Speed of Judgment.
Nevertheless, he might fail completely, because he lacked factor X.

This kind of error can be avoided in either of two ways. One way
would be to apply a special cutting score in cases of variables like
factor X. For example, a student would be given no prediction for
success as a mechanic unless his factor X score fell within the range which
the criterion group of mechanics had for factor X. That is, unless the
student is willing to get messed up with dirty grease, you don't predict
his success as a mechanic at all. If his factor X score was in the proper
range, his success in mechanics would then be properly predicted by
the regression equation computed from uncorrected validity coefficients.
For such individuals whose factor X scores were already known to be
within this high range, the amount of factor X possessed by them might
be sufficient for success as a mechanic, and therefore not important in
predicting amount of success. That is why the low validity coefficient of
factor X would be appropriate, provided factor X was used separately
to eliminate those whose scores on it are low.
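
The two-stage procedure just described, a cutting score on factor X followed by an ordinary regression prediction, might be sketched as follows. Every number here is hypothetical: the weights, the intercept, and the factor X range are invented for illustration and are not taken from the handout.

```python
def predict_mechanic_success(scores, weights, intercept, x_range):
    """Return a predicted criterion score, or None when no prediction is made.

    scores    : dict of the student's factor scores, including "factor_x"
    weights   : regression weights from uncorrected validities (hypothetical)
    intercept : regression constant (hypothetical)
    x_range   : (low, high) factor X range observed among mechanics
    """
    low, high = x_range
    # Stage 1: the cutting score. Outside the mechanics' range, decline
    # to predict rather than extrapolate.
    if not (low <= scores["factor_x"] <= high):
        return None
    # Stage 2: the regression equation; factor X carries no weight here.
    return intercept + sum(weights[name] * scores[name] for name in weights)


weights = {
    "mechanical_knowledge": 0.4,
    "carelessness": -0.3,
    "speed_of_judgment": -0.2,
}
willing = {"mechanical_knowledge": 60, "carelessness": 45,
           "speed_of_judgment": 45, "factor_x": 62}
unwilling = dict(willing, factor_x=40)

print(predict_mechanic_success(willing, weights, 55.0, (58, 70)))
print(predict_mechanic_success(unwilling, weights, 55.0, (58, 70)))
```

Returning `None` rather than a low number mirrors the text: the student outside the range is given no prediction at all, not a poor one.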

The GATB takes this matter into account through the rules it uses
for selecting the "key aptitudes" upon which the qualification of
individuals for jobs is based. Among these rules are the provisions that
aptitudes should be considered as "key aptitudes" for a particular job
if the mean score for people in that job is high relative to the mean score
of the general population and if the standard deviation of the scores for
people in that job is low relative to that for the general population. By
selecting "key aptitudes" in this way, GATB is giving extra weight to
the aptitudes which are thought to be so important to a job that their
range of scores for people on the job is high and restricted. The added
weight given to such aptitudes will quite properly tend to offset the
lowering of the observed validity coefficient due to restriction of range.

Now let's examine again what we are really doing when we use a
variable for guidance just because its mean is high for a particular
criterion. Let's also examine what we are really doing when we correct
for restriction of range. In the example I mentioned it turned out that
mechanics have a high mean in carelessness even though the criterion
values correlate negatively with carelessness. If we guide students into
mechanics just because they resemble our criterion group of mechanics,
we are assuming erroneously that it is good for mechanics to be careless.
Let's say we have found that some people who tried to be mechanics
but could not make the grade were low on factor X. This would show
factor X to have positive validity even though validity coefficients may
not have revealed it. Or perhaps there is some psychological or practical
reason that makes it logically apparent that mechanics should be high
on factor X. If either of these things is so, it would be reasonable to
guide into the mechanical trades only those students who were high
on factor X.

Now take the case where we do not have an independent study showing
the validity of any aptitude with restricted range and do not have
any particular psychological reason for being sure that high scores on
any aptitude are necessary for mechanics. If restriction of range on
any one aptitude is extreme, we must, as I mentioned before, limit our
predictions based on that aptitude to persons whose scores fall within
the restricted range. If, on the other hand, restriction of range is, say,
no greater than 50 per cent, it is possible to use the known range for
mechanics and the known range for the total population to correct the
obtained validity for restriction of range. When the corrected validity
coefficient is used, the aptitude with a restricted range of scores should
take its proper weighting in the regression equation, and any student
whether within the restricted range or not can be given a prediction as
to the amount of success he could expect if he entered mechanics.
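
The talk does not say which correction formula is meant. The sketch below assumes the standard correction for direct restriction on the predictor, which needs only the observed validity and the two standard deviations. It also reads "restricted 50 per cent" as a criterion-group standard deviation half that of the total population; both the reading and the example validity of .20 are assumptions.

```python
import math


def correct_for_restriction(r_obs, sd_restricted, sd_unrestricted):
    """Correct an observed validity for direct range restriction.

    Assumed formula, with k = sd_unrestricted / sd_restricted:
        r_corrected = r_obs * k / sqrt(1 - r_obs^2 + r_obs^2 * k^2)
    """
    k = sd_unrestricted / sd_restricted
    return r_obs * k / math.sqrt(1 - r_obs ** 2 + (r_obs * k) ** 2)


# Observed validity .20 among mechanics whose spread on the aptitude
# is half that of the general population (SD 5 against SD 10).
corrected = correct_for_restriction(0.20, 5.0, 10.0)
print(round(corrected, 3))
```

The correction rests on the same linearity assumption discussed in the next paragraph, so a corrected coefficient is only as trustworthy as that assumption.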

This is all very satisfactory if the regression is linear. However, if
there are no mechanics with low scores on factor X, we will not be able
to tell whether it is linear. The lower part of the scatter plot of factor X
scores versus mechanics criterion values does not exist. Linearity in this
lower part of the scatter plot cannot be proved, but must be assumed in
order to extrapolate the regression line to accommodate students with
low values of factor X. If restriction is not more than 50 per cent the
assumption is probably not more dangerous than many of the assumptions
we have to make in the field of testing. However, some accuracy
of prediction is lost by having to extend the regression line out beyond
the range which served to locate it experimentally. Not only do such
predictions of scores suffer from the usual error variance of the
distribution of actual scores above and below the regression line, but there
is also error variance resulting from errors in the determination of the
slope of the regression line. Such errors become increasingly serious as
the predictor score recedes from the mean of the criterion group.
Snedecor* gives the formula for this variance. This is Formula 3 on the
handout. The separate error variances are additive. The "1" in the
parentheses is the usual error variance around the regression line.
"1/N" represents the error in locating the mean through which the
regression line must pass, and "$x^2/\Sigma x^2$" represents the error
variance caused by errors in the slope of the regression line.

How serious a reduction in the accuracy of prediction is this? If, for
example, the range of a predictor is restricted 50 per cent because the
criterion group consists of very high scoring people on the predictor, a
few students asking for guidance could be as far as eight standard
deviations from the mean of the criterion group. Although this would
be extreme, let's find out what the accuracy of prediction would be.
With 100 cases $x^2/\Sigma x^2$ would equal .64. 1/N would be .01. The
error variance, then, would be 64 per cent higher than the error variance
for cases near the mean. The standard error of the predictions would be
29 per cent higher. This is enough to be considered, but is not very
serious even for extreme cases as long as restriction of range is not over
50 per cent and as long as there are a reasonable number of cases in
the experiment.
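
The arithmetic in this example can be retraced with a short sketch. It assumes that the sum of squared deviations in the criterion group is approximately N times the squared standard deviation, so that a score eight standard deviations out with 100 cases gives 64/100. The function name is hypothetical.

```python
import math


def variance_factor(x_dev_in_sd, n_cases):
    """Bracketed term of the prediction-variance formula: 1 + 1/N + x^2/sum(x^2).

    x_dev_in_sd : distance of the predictor score from the criterion-group
                  mean, in criterion-group standard deviations
    n_cases     : number of cases used to fit the regression line
    Assumes sum(x^2) is about N * SD^2, so with SD = 1 it is simply N.
    """
    sum_of_squares = float(n_cases)
    return 1 + 1 / n_cases + x_dev_in_sd ** 2 / sum_of_squares


far_from_mean = variance_factor(8.0, 100)   # 1 + .01 + .64
at_the_mean = variance_factor(0.0, 100)     # 1 + .01
se_ratio = math.sqrt(far_from_mean / at_the_mean)

print(round(far_from_mean, 2), round(at_the_mean, 2), round(se_ratio, 2))
```

The slope term adds .64 to the error variance, and the standard error of prediction comes out between a quarter and a third larger than for a student at the group mean, in line with the figures quoted above.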

Again and again it seems that there is not one best method for doing
something. The method depends upon the practical purpose. If a student
wants to know how well he will succeed if he goes into mechanics, you
should tell him how much he resembles the typical mechanic only if
that is all you are able to tell him. Otherwise tell him what he wants
to know. Estimate his likelihood of attaining a given amount of success.
If a predictor has a restricted range for some criterion, don't correct for
restriction of range if you consider people outside the range to be
unqualified anyway, but do correct for restriction of range if you want
to get the best prediction for people outside the range. The statisticians
and psychometricians offer us an impressive inventory of formulas from
which to choose. However, this does not always make the choosing easy.
For me, I think it's like being a little boy facing the horrendous problem
of choosing exactly the right piece of candy from a great big box.

*Snedecor, G. W. Statistical Methods. Ames, Iowa: Iowa State College
Press, p. 120.






Table 1. Validity of factor scores for job training criteria.

                        Office    Beau-     Car-      Me-
                        Workers   ticians   penters   chanics

Perceptual Speed          46
Mechanical Knowledge                          39        36
Carelessness                        33                 -27
Speed of Judgment         31        37                 -23


Table 2. Mean factor scores for students who entered the four jobs.

                        Office    Beau-     Car-      Me-
                        Workers   ticians   penters   chanics

Perceptual Speed          58        52        47        47
Mechanical Knowledge      39        39        55        58
Carelessness              48        50        48        51
Speed of Judgment         53        51        48        49



Formula 1. Mollenkopf's formula for the validity of the prediction of
a difference.



Formula 2. Kelley's formula for a value used in obtaining the proportion
of differences not accounted for by chance.

$$\frac{\sqrt{2 - r_{1I} - r_{2II}}}{\sqrt{2 - 2r_{12}}}$$

Formula 3. Snedecor's formula for the standard error of a prediction for
predictor scores not close to the mean of the criterion group.

$$S^2 = \sigma^2 \left(1 + 1/N + x^2/\Sigma x^2\right)$$



GENERAL MEETING 

Communication 
of Test Information 






Helping Students Understand 
Test Information 



JOHN W. GUSTAD 



The past fifteen years have seen developments in most branches of 
science and technology which even their greatest apologists would have 
felt to be impossible. Psychology in general and testing in particular 
have been in the van of these developments. Testing is quite a bit bigger 
business than it was when Wolfle (22) rendered an accounting just 
under ten years ago. While comparatively few Americans will, in their 
lifetimes, encounter psychologists directly, vast numbers will encounter 
tests. This will occur in school or college, in the military, in industry, in 
hospitals, clinics, or prisons. The chances of an individual's avoiding 
testing are rapidly approaching his chances of avoiding finger-printing, 
having chest X-rays, or paying income taxes. 

There are numerous highly verbal critics who see or profess to see in
this movement portents of the brave new world or of 1984. Zealous
advocates are equally sure that God's in His heaven and all will be
right with the world as soon as testing is applied to all human relations
enterprises. As usual, the truth probably lies between these poles. Many
psychologists are deeply concerned that test construction has lagged
behind the rapidly developing science and that a technical product is
being marketed in the name of psychology which does not represent
the best thinking available. There are undoubtedly good reasons for
these and other concerns. Growth spurts often bring with them some
loss of coordination.

One group from which we have heard comparatively little but whose
reactions should concern us greatly is made up of the rapidly growing
pool of people who have been tested. These consumers have opinions:
they also have money and votes. Since we professionals do most of the
writing, the ideas of the consumers have not been well represented in
the literature. The situation is especially critical in counseling and
clinical psychology, for here much of the process rests on the assumption
that the client will be willing to make use of information about himself
derived in part from tests.

The vision which Parsons (16) incorporated in his book nearly half
a century ago is becoming dim. There are good reasons for this, because
his simple, three step scheme was somewhat too simple. Nevertheless,
the general notion that one should analyze the individual, analyze the
job, and match the individual and the job can still serve a useful
purpose. When Parsons wrote his book, methods for individual analysis
were few in number and crude in character. Today, a glance at Buros'
latest volume (5) might be taken by some as prima facie evidence that
there were more than enough analytic methods available. I doubt that
many of us would accept this verdict whole-heartedly. Still, among the
thousands of tests available, there are some whose validities and
reliabilities are respectable enough to make them useful.

When tests are used administratively, as in the military establishment
or in industry, administrators must consider public relations.
Most will recall the furor associated with the introduction of the Selective
Service Qualification Test. In the counseling situation, where client
rapport is even more critical, where the usefulness of tests is measured,
or should be measured, in terms of the adequacy of the decisions
made by the client, we encounter problems striking at the very core of
our operation. The opinions of clients are not known with any degree
of accuracy; among counselors and clinicians, the dissatisfaction, the
malaise, the gnawing uncertainty are acute.

Why, one might ask, can one not interview a client with a vocational
choice problem, assign a battery of tests, give him the scores, and then
expect that he will act as appropriately as the situation allows? This
modus operandi was (and perhaps still is in some quarters) in effect for
a long time with, it should be noted, not entirely bad effects. Yet most
of us share some of the acute dissatisfaction with this approach.

Our colleagues with the well thumbed volumes of Freud's collected
works on their shelves have pointed out that such procedures ignore the
facts of life regarding motivation, conscious and unconscious. People,
even college sophomores, have motives. Worse, these motives are
dynamic, whatever that means. Sometimes, clients will not do their best
on our tests. Most tests presume the presence of the old college try. On
personality and interest tests, clients will sometimes lie to us, to
themselves, or to both. Even if they do not lie very much and if they do try
to answer the items to the best of their abilities, they will often refuse
to believe or to act on the results of the tests they have taken. Anyone
who has ever tried to convince an aspiring pre-medic that he just does
not have the ability to make it, especially if a favorite uncle once patted
him on the head and told him he was a real smart boy, will know what
I mean. Most perverse of all, many will finally acquiesce on the surface
but will, once outside the counselor's office, go on doing the same old
maladaptive things, be that trying to get into medical school with a
tenth percentile ACE score or trying to get through engineering school
with an equally low score on the Engineering and Physical Sciences
Aptitude Test.

It seems to me that the problem may be considered from two major
points of view. First, we might well examine the client and especially
the task which our tests have set for him. Second, we might consider
techniques used by counselors and clinicians in trying to help the
client complete his task successfully. The client's task is, to a
considerable extent, determined by the psychologist, and I would like to turn
first of all to this aspect of the problem. As scientists in more or less good
standing, we share a passion for precision and accuracy. We sometimes
share the feeling of Samuel Butler who said, "I do not mind lying, but
I hate inaccuracy." The language of numbers is rather natural for us,
and sometimes it is productive. Moreover, we have a passion for speaking
scientifically, which often means that we cover our tracks with
qualifications so extensive and intricate that even we are sometimes in doubt
about what our colleagues really are saying. Useful and proper as the
language of numbers and standard errors is, it is not the language of the
clients with whom we deal. Yet the process goes inexorably on with us
following the currents in our science and drifting farther and farther
away from the consumers of our technology.

Binet set out to measure intelligence. Most people think they know
what intelligence is. Before he got very far on the way, Binet had
introduced a strange new concept: mental age. Stern, searching for a metric
by means of which to express this characteristic, put mental age into a
ratio with chronological age, multiplied the whole melange by 100, and
came up with the I.Q. This has become after forty years a household
term, but by now most of us doubt its value and for the most part leave
it out of our test development enterprises. Yet notice how far from the
client's universe of discourse the first widely used test got and in how
short a time.

The same pattern may be seen in the development of personality
tests. Woodworth set out to accomplish a fairly straight-forward task:
to sort out neurotics. Most people have some idea about neurosis. At
least, they think it is a bad thing that has something to do with the
personality. Perhaps this is enough. But what has happened in the past
thirty-five years? Introversion-extraversion tests were developed. By
the time these terms were becoming dimly understood, dominance and
submission tests were the thing. Current tests locate the client on
continua such as psychasthenia, Vil and CK, D and Dd, W, rhathymia,
K, F, anxiety, repression, etc. How productive these particular traits
are is not at issue here. The point is that we have, in groping toward a
better understanding of personality, departed a great distance from the
language of the client. It is, of course, true that some of these tests are
not meant for the client's perusal but only for the counselor's edification.
Nevertheless, the problem remains in many instances.

Some years back, Paterson and his colleagues in the Employment
Stabilization Research Institute attempted to extend the psychograph
principle. The occupational ability profile, while something of a
misnomer, nevertheless represented an attempt to make test scores
meaningful to counselors and clients, to come to terms with the dictum about a
picture and a thousand words. The usefulness of occupational ability
profiles for the personnel man has been fairly well demonstrated.
Considerably less has been said about the client's problem of trying to learn
about himself from the inspection of such profiles. One of the few
thorough treatments is that of Bennett, Seashore, and Wesman (1).
Profiles are still very much with us, but the more expert the counselor
or clinician, the more he sees or professes to see in the relationships
among the points in the profile. Clearly, profile analysis as it is usually
practiced is not for the college sophomore. Parenthetically, I am
somewhat intrigued by the different treatment afforded to profiles of ability
scores and those of personality or interest scores. In the latter case, the
interpretations often border on the mystifying. The MMPI, Strong,
and Rorschach seem especially vulnerable. Except for some attempts
with the Wechsler, I know of few instances where people have become
particularly "dynamic" with profiles of ability scores. This leads me
to wonder whether we are missing the boat in interpreting ability
profiles or whether the interpretations of personality and interest
profiles represent rather stupefying metaphysical leaps. Only time, and
good criteria, will tell.

There is another line of development which has perforce contributed
to the present difficulties. Binet worked hard to measure a global trait,
intelligence. Other test constructors followed suit with tests of
neuroticism, adjustment, mechanical aptitude, etc. Increasingly, there has
been a tendency to try to measure pure traits. This has arisen largely
as a result of the developments in factor analysis. I happen to be among
those who believe that this line of endeavor will in the long run pay off
with better tests and better descriptions of human behavior. The
problem with which I am concerned here, however, is the intelligibility
of test scores to the client. I would like to repeat here a notion I first
expressed several years ago (14), namely, that the difficulty of test
interpretation is inversely related to the counselor's understanding of
the trait measured and to its predictive significance. It seems unlikely
that we should try to give all of our clients a short course in
psychometric and factor theory so that they will understand our tests. This
task is hard enough with graduate students.

It would be possible to go on at considerable length documenting the 
difficulties which a developing test technology and theory present to 
clients and counselors, but I hope that the point has been made ade- 
quately. We are in somewhat the same situation as the physicist who, 
when asked to describe a chair, quite accurately states that it is largely 
made up of empty space crisscrossed by wandering atoms. Such an 
answer is of comparatively little use to a person who wishes to know 
whether or not he should sit down and, if so, what the consequences 
will be. I am certainly not proposing that we return to the measurement 
of the old, complex, global traits like mechanical aptitude and general 
intelligence. I am, however, suggesting that we have created a con- 
siderable gap between the client and his language and our tests and 
their language. Parenthetically, and related to this same area, we might
do well to consider the problem of validity. I sometimes wonder how
much rapport we lose when a client, trying to decide between medicine 
and engineering, takes an inventory which asks him whether he would 
rather be a motorman or a conductor. We are all aware of the predictive 
validity of such items, but clients are not. Perhaps something more 
might be done following Gulliksen's distinction (12) between intrinsic
and correlational validity. 

Turning now to the other issue, the counselor's methods, there has 
been growing for the past several years the feeling that our methods 
of introducing testing in the first place and of interpreting tests in the 
second have something to do with the problems we face in getting tests 
and test results accepted and acted upon. The general tenor of the 
arguments presented by Rogers (17) is too well known to need repeat- 
ing. Among those happier with the use of tests in counseling, Bordin 
and Bixler (4) proposed that the process of test selection be considered 
an integral part of counseling, not an intruding element. They went on 
to suggest that the identification with the process achieved by encourag- 
ing the client to participate was worth any difficulties it might create. 

The subsequent work of Seaman (19) and Dressel and Matteson (7)
provided some substantiation for the ideas expressed by Bordin and
Bixler. Seaman was interested in whether, in a permissive situation,
clients would select appropriate tests in sufficient number. He
concluded that they tended to do so. Dressel and Matteson went farther
to study the effects of such involvement in the choice process on some
outcomes of counseling. They found that client participation was
positively related to improved self-understanding and to greater feeling of
security in the choice made but not to satisfaction with counseling. A
study recently completed at Maryland bears on the same point;
discussion of it will be postponed until later.

With respect to client participation in test interpretation, much the 
same situation obtains. Bixler and Bixler (3) proposed that such participation
would have salutary effects on counseling. Several studies
provide partial substantiation. Dressel and Matteson (8), in another 
study, concluded that students who participated most gained cor- 
respondingly in self-understanding, in security with respect to the 
choice made, and in satisfaction with counseling. Kamm and Wrenn 
(15) concluded that client acceptance of test information was best when 
several conditions were met: first, when the client and counselor were 
completely at ease; second, when the client took a positive attitude 
throughout counseling; third, when the client was ready to respond on 
the basis of the new information; fourth, when the information presented
was directly related to the client's problem; fifth, when the
information presented was not in conflict with the client's self-concept.
Kamm and Wrenn seem to be describing non-defensive clients. These
are certainly desirable, but the techniques for reducing defensiveness
are somewhat difficult to isolate.

Taking a slightly different tack, Rogers (18) compared two methods 
of counseling, one of which encouraged client participation, the other 
of which did not. He found no differences between the groups handled
by the two methods, but he did find that higher level intelligence and
more active client participation in counseling were related to better 
outcomes. 

Intrigued by some of the same problems, we recently completed a 
study (13) at Maryland, conducted under a contract with the Office of
Naval Research, dealing with different methods of test introduction and
test interpretation and their effects on client learning as a dependent
variable. Very briefly, we selected three methods of introducing and
selecting tests and four methods of interpreting test results. The dependent
variable was a discrepancy index employing differences between self-ratings
and tested positions. The discrepancy index was adjusted for
initial accuracy so that clients who showed high accuracy on pre-counseling
ratings would not thereby be penalized in post-counseling
ratings. Test introduction methods varied from extremely permissive
to quite directive. Test interpretation methods included the use of 
profiles, verbal descriptions without visual aids, and two methods em- 
ploying the clients' initial ratings which were compared with test scores. 
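
The paper gives no formula for the adjusted discrepancy index, so what follows is only a minimal sketch of one way such an index might be computed. It assumes that self-ratings and tested positions lie on the same numerical scale, that discrepancy is the mean absolute gap across traits, and that the adjustment expresses gain as the proportion of the initial discrepancy removed. The function names and the sample figures are hypothetical.

```python
from statistics import mean


def discrepancy(self_ratings, tested_positions):
    """Mean absolute gap between self-ratings and tested positions."""
    if not self_ratings or len(self_ratings) != len(tested_positions):
        raise ValueError("ratings and tested positions must be non-empty and equal in length")
    return mean(abs(s - t) for s, t in zip(self_ratings, tested_positions))


def adjusted_gain(pre_ratings, post_ratings, tested_positions):
    """Proportion of the initial discrepancy removed (an assumed adjustment).

    Returns None when the client was already perfectly accurate, since
    there is then no initial discrepancy against which to scale the change.
    """
    pre = discrepancy(pre_ratings, tested_positions)
    post = discrepancy(post_ratings, tested_positions)
    if pre == 0:
        return None
    return (pre - post) / pre


# Hypothetical client A: quite inaccurate before counseling.
tested_a = [5, 3, 4]
gain_a = adjusted_gain([3, 5, 4], [4, 4, 4], tested_a)

# Hypothetical client B: nearly accurate before counseling.
tested_b = [5, 3, 4]
gain_b = adjusted_gain([5, 3, 5], [5, 3, 4], tested_b)

print(gain_a, gain_b)
```

On these made-up figures the second client, who began nearly accurate, shows the smaller raw change but the larger adjusted gain, which is the kind of protection for initially accurate clients that the adjustment is meant to give.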

Neither the rows nor the columns, introduction and interpretation 
methods, were related differentially to the dependent variable. Equal 
changes were observed for all groups. Moreover, the interaction term
between interpretation and introduction was not significant. These
results are in close agreement with those reported by Singer and
Stefflre (20).

In connection with the same research project, Tuma (21) undertook
to study certain personality characteristics of pairs of clients and
counselors as those might be related to the dependent variable. His
research followed the general line laid down by Fiedler (9) (10) (11).
He found some relationships existing which suggested that methods as
such, taken apart from the personalities involved, are perhaps not the
most fruitful variables for study. He found, for instance, significant
differences in average gains among clients seen by different counselors
and significant correlations between client-counselor similarity indices
on selected personality traits and the dependent variable. These correlations
were significant only for the ability variables. Dominance,
social participation, and social presence were the variables with the
highest correlations.

A point to be kept in mind in the above studies concerns the different
kinds of dependent variables employed. Singer and Stefflre, Tuma, and
I employed adjusted discrepancy indices. Correlations between initial
and final self-ratings and test scores have been used (2) as well as
unadjusted discrepancy indices. All of these, it must be remembered, are
only intermediate criteria, not ultimate ones. Dependent variables seem
in general to vary in availability inversely with their importance. Dressel
has summed up the case very well in the following:

. . . our real concern . . . is only in part with the here and now;
the ultimate concern is with the years after completion of school.
Lacking the means for extensive follow-ups, recognizing the difficulty
in attributing to counseling its exact contribution, and
having a natural impatience for immediate action, we turn to
criteria such as grades, graduation, stay in school, stability, or
satisfaction with choice of major. Such criteria are not always
applicable to all individuals in the same way and their relation
to ultimate goals is not clear. (p. 71)

If I may summarize, and perhaps over-simplify in doing so, it appears
that the solution to the problem of how to make test scores meaningful
to clients lies imbedded in the interpersonal relationships obtaining in
the counseling interview. Moreover, techniques as such are probably
not the final question; rather, we must seek to find those techniques
which can be applied by selected counselors to appropriate clients. This
is a large order.

In the meantime, I would like to reiterate my earlier point, namely,
that we turn some attention to bridging the gap between our tests and
our clients. I am certainly not proposing any abandonment of the search
for better and more meaningful traits, but tests used in counseling are
to a considerable extent useful in direct proportion to their intelligibility 
and acceptability to the client. Both for this kind of enterprise as well 
as for the work to be done on devising and revising techniques we need 
criteria which are closer to the life situations in which decisions are made 
and acted on. Until we get these, our research must remain under the 
cloud of suspicion that clients simply learn how, during the process of 
counseling, to say things that will make the counselor happy. 

Since I have spent my time talking about problems and areas of
ignorance rather than laying down nice clean, simple, guaranteed rules 
for making test information meaningful, I am afraid that this may have 
sounded like that most pedestrian of all prose productions, the doctoral 
dissertation. Rather than closing, then, with a plea for further research, 
I will read a couplet of Alexander Pope's which seems to sum up as well 
as anything the job we have to do: 

Men must be taught as if you taught them not, 
And things unknown proposed as things forgot.

REFERENCES

1. Bennett, G., Seashore, H., and Wesman, A. Counseling from profiles. New
York: The Psychological Corporation, 1951.

2. Berdie, R. Changes in self-ratings as a method of evaluating counseling. J. couns.
Psychol., 1954, 1, 49-54.

3. Bixler, R. and Bixler, V. Test interpretation in vocational counseling. Educ.
psychol. Measmt., 1946, 6, 145-153.

4. Bordin, E. and Bixler, R. Test selection: a process of counseling. Educ. psychol.
Measmt., 1946, 6, 361-374.

5. Buros, O. The fourth mental measurements yearbook. Highland Park, N. J.:
Gryphon Press, 1953.

6. Dressel, P. Evaluation of counseling. In Berdie, R. (Ed.) Concepts and programs
of counseling. Minneapolis: University of Minnesota Press, 1951.

7. Dressel, P. and Matteson, R. The effect of client participation in test introduction.
J. consult. Psychol., 1949, 13, 82-[illegible].

8. Dressel, P. and Matteson, R. The effect of client participation in test interpretation.
Educ. psychol. Measmt., 1950, 10, 693-706.

9. Fiedler, F. The concept of the ideal therapeutic relationship. J. consult. Psychol.,
1950, 14, 239-245.

10. Fiedler, F. Comparison of therapeutic relationships in psychoanalytic, non-directive,
and Adlerian theory. J. consult. Psychol., 1950, 14, 416-445.

11. Fiedler, F. Factor analysis of psychoanalytic, non-directive, and Adlerian
therapeutic relationships. J. consult. Psychol., 1951, 15, 32-38.

12. Gulliksen, H. Intrinsic validity. Amer. Psychol., 1950, 5, 511-517.

13. Gustad, J. The effects of differing methods of test selection and interpretation
on learning in the interview. Final report. Office of Naval Research Contract
Nonr 1225 (00[illegible]), 1955. Mimeographed.

14. Gustad, J. Test information and learning in the counseling process. Educ.
psychol. Measmt., 1951, 11, 788-795.

15. Kamm, R. and Wrenn, C. Client acceptance of self-information in counseling.
Educ. psychol. Measmt., 1950, 10, 32-42.

16. Parsons, F. Choosing a vocation. New York: Houghton Mifflin, 1909.

17. Rogers, C. Psychometric tests and client centered counseling. Educ. psychol.
Measmt., 1946, 6, 139-144.

18. Rogers, L. A comparison of two kinds of test interpretation interview. J. couns.
Psychol., 1954, 1, 224-231.

19. Seaman, J. A study of preliminary interviewing methods in vocational counseling.
J. consult. Psychol., 1948, 12, 321-330.

20. Singer, S. and Stefflre, B. Analysis of the self-estimate in the evaluation of
counseling. J. couns. Psychol., 1954, 1, 252-255.

21. Tuma, A. An exploration of certain methodological and client-counselor personality
characteristics as determinants of learning in the counseling of college
students. Unpublished Ph. D. dissertation, University of Maryland, 1955.

22. Wolfle, D. Testing is big business. Amer. Psychol., 1947, 2, 26.




The Obligations of the Test User 



ALEXANDER G. WESMAN



The conscientious publisher of psychological and educational tests
occupies an unusual, if not unique, position. Like the manufacturer of
scientific apparatus, he is engaged in the production of instruments to
meet the needs of professional people. Like the book publisher, he
faces the problems of printing, of editing, of working with authors and
their idiosyncrasies, of copyrights. Unlike the manufacturer of scientific
apparatus, who can assume that the physicist, chemist or medical doctor
understands the apparatus that is purchased, the test publisher can
make no similar assumption. And unlike the book publisher, who does
not need to concern himself with who reads his books (except that it
be as many as possible), the test publisher must be constantly and
actively concerned with those who use his products, lest those products
fall into improper hands.

Further to complicate matters, the ethical publisher, having restricted
his market according to the dictates of his conscience, still finds himself
with purchasers whose preparation for the use of the published materials
varies from complete knowledge and considerable sophistication to little
or no training and dismaying naivete.

The dictates of his conscience are not the only moral force acting on
the publisher of educational and psychological tests and techniques. In
recent years, much time and thought have been devoted to the consideration
of his obligations to the professions and to the general public.
Committees on Test Standards have been appointed by the American
Psychological Association, the American Educational Research Association
and the National Council on Measurements Used in Education
for the express purpose of formulating specifications for tests and test
manuals. The codes which emerged as a result of their deliberations
have been reported by these associations in two pamphlets, copies of
which should be in the hands of every test user. They are, on the whole,
very sound documents; one hopes that the moral pressure they try to
exert will, in the long run, prove beneficial.

Additional pressure is also directed at publishers by Buros' Mental
Measurement Yearbooks, by test review forms in textbooks such as
Cronbach's or Thorndike and Hagen's, and the critical reviews which
appear in professional journals. These influences are forces for the good.
How effective they really are is unfortunately a matter for dispute.






Just a year or two ago, at one of these ETS Conferences, Oscar Buros
offered the exasperated judgment that tests and test manuals published
in recent years are not as good as many of those published a quarter of
a century earlier. I doubt that many of his colleagues would adopt a
similarly extreme position. At the same time, there are those of us who
are cynical enough to believe that the mere existence of recommendations
and reviews does not ipso facto improve the quality of instruments
offered to the test user.

Over the years, it is neither the publisher nor the critic who most
effectively determines the quality of tests; rather it is the test user.
Unless the test user knows what a good test is, and withholds support
from those which fail to meet high standards, the recommendations
enunciated by organizational committees will be worse than ineffective;
they will be put to harmful use as just one more device for deluding the
innocent. A statement such as "the author has considered the Technical
Recommendations set forth by the APA, AERA, and NCMUE in preparing
this manual" could provide an aura of respectability which a
given manual may not deserve, and the uncritical might well be misled.
There is no Good Housekeeping seal of approval in the field of test
publication; there is no substitute for professionally competent and
conscientious judgment on the part of the test user. Test publishers
have important professional obligations; test users have parallel
responsibilities.

Test publishers should refrain from making unsubstantiated claims
for the validity of the tests they offer; they should distinguish between
what they hope, and what has been demonstrated. Test users should
also be able to distinguish between what is hoped and what has been
demonstrated; they should reject exaggerated claims of merit despite
the attractiveness of the manual's format or the eminence of the author.
Validity is a matter of the content of the test and the situation in which
it is used. It is not assured by either the renown of the writer or reputation
of the package designer.

It is proper and desirable for researchers to try instruments in new
applications. One hopes, of course, that in the original selection of tests
to be tried, some reasonable hypotheses have guided the researcher in
his choice; this is not always the case. In any event, the researcher can
make more or less of a contribution by publishing his results. If his
results are positive, they serve to alert others to new situations in which
a test may be effective; if negative, other researchers may be spared
the futile effort of duplicating the experiment.

However, if the user applies a test in a situation for which neither the
author nor publisher intended it, a negative result should not be construed
as adverse criticism of the test. It may more appropriately be
announced as a failure of the researcher's hypotheses to stand up. It is
ironic that publishers and authors should so often be blamed when
tests won't do what they were never intended to do; when the only fair
comment on a study is "why did the researcher expect the test to be
useful under such conditions?" The publisher may properly be taken to
task if his tests don't work when they should; the tests should not be
criticized if they don't work in situations for which they were not
intended nor recommended.

The summer issue of Personnel Psychology contains an example of 
this abuse. A group of graduate engineers was given a series of tests 
including DAT Space Relations, Mechanical Comprehension BB and 
Otis Arithmetic Reasoning. The authors of the article reporting this 
research express surprise that the tests failed to discriminate among the 
engineers. The proper occasion for surprise is that these tests were
chosen for use in this situation in the first place! They are good tests 
for the populations and purposes for which they were intended — high 
school students or unselected adults. That tests published for these 
levels do not yield adequate distributions for a group which has had
intensive academic and professional experience with mechanical forces 
and advanced mathematics is hardly noteworthy. If the Miller Analogies 
Test, or the Minnesota Engineering Analogies Test, or the GRE Advanced
Mathematics Test was not discriminative, we might criticize the test;
with the tests selected for this study, we can only question the wisdom 
of the researchers. 

Test publishers are constantly engaged in amassing evidence con- 
cerning the validity of their instruments in various applications and 
with different kinds of subjects. Test users must recognize that unless
they provide the subjects to be tested, the needed data cannot be accumulated.
Few and far between are the occasions when a test publisher
has a captive group of subjects at his mercy. More typically he is wholly
dependent on the cooperation of the school administrator, counselor or
teacher. The user has the right, and the duty, to refuse to buy a test
which lacks proper documentation; he is also under some obligation to
accept his proportionate share of the burden of providing a situation
in which evidence concerning the test may be gathered during its exploratory,
standardization and validation phases. It should not be left
to the cooperative minority to provide the necessary subjects; all schools 
which hope to profit by the existence of good instruments should par- 
ticipate in experimental programs on appropriate tests. 

A similar point may be made with respect to already existing tests.
The publisher who neglects to collect serviceable normative data for
his tests is properly to be criticized. Is less criticism due the non-cooperators
in the schools (and in industry, government and private practice)
who have useful normative data in their files but do not make those data 
available to the publisher and, through him, to their colleagues? How 
many millions of test scores repose in dusty files, or have been destroyed, 
which could have augmented the norms in the hundreds of manuals 
now in print? 

Publishers should not over-emphasize the role which their tests should 
play in the over-all evaluation of a student, employee or client. The user 
might well apply an equal sense of perspective. It is difficult to say 
which has done the testing movement more harm — the naive optimist 
or the equally naive pessimist. The optimist looks to tests to solve all 
his evaluation problems; in effect, he surrenders the responsibility of
personal judgment in exchange for the luxury of having something else
make his decisions; often, it is a "something else" which was uncritically
chosen in the first place. He operates as a clerk rather than as a pro- 
fessional man. 

The naive pessimist, on the other hand, casts his jaundiced eye on
the acknowledged limitations inherent in even the best of our tests.
Though he would probably not say so boldly, he rejects the tests, in 
effect, because they don't have perfect reliability or perfect validity. 
If, in spite of his protestation, tests are used in his school, he warns in 
doleful tones that the scores must not be used alone as a basis for 
evaluating the individual. 

We have no quarrel with the principle that a single test score (or,
for that matter, a series of test scores) should not provide the sole basis
for action of any kind. Publishers typically urge users to correlate the
information obtained from tests with all other relevant information that
can be obtained, including grades in school, anecdotal records, physical
reports, social workers' reports and whatever else local facilities permit.
Our quarrel is that the naive pessimist wears blinders.

It is true that tests are not perfectly reliable or valid; is perfect 
reliability or validity to be found in grades? in anecdotes? in teacher 
observations? It is likewise true that tests alone are insufficient evidence 
for total evaluation of the student. Are we, however, to be satisfied with 
the evidence we obtain from grades alone? from anecdotes alone? from 
social workers' visits alone? One wonders whether it is not a sincere 
compliment (though perhaps unintended) that tests are singled out for 
warning with regard to their use in isolation; could it be that test scores 
are the only kind of information which would be considered tempting
enough for such use? Nothing is that good, of course, but it is interesting
that no one ever warns us about the isolated use of anecdotes or
teacher observations. 

The publisher has the obligation of keeping abreast of new develop- 
ments in educational and psychological principles and practice, and of 
building tests which will reflect those modern concepts. The user is 
equally obligated to understand these newer instruments and the ideas 
they represent. We are sometimes told by administrators that, while 
they approve of intelligence measures with differential scores, such
instruments can't be used because their teachers (or counselors) are used 
to the simple, single IQ. Is this reasonable? These same teachers are 
expected to look at a cumulative record showing grades in a variety of 
subjects and extract meaningful information. Multi-score achievement 
batteries are the rule, almost without exception; yet the teachers have 
presumably learned to interpret results from these tests. Why, then,
should teachers and counselors be accused of inability to learn to interpret
several scores on differential aptitude or intelligence tests? The
logical answer seems to be that they can learn, if the administration
takes its own responsibility seriously enough to provide the opportunity
and motivation for learning. Modern medicine requires the general
practitioner to understand the properties of modern wonder drugs.
Modern education requires modern testing embodying modern concepts,
and a willingness on the part of educators to continue their own
education.

The preparation of a manual which provides the necessary instruc- 
tions for administration, scoring and interpretation is an obvious duty 
of the publisher. Following those instructions is a parallel responsibility
of the user. Every one of us, I dare say, has seen impossible scores reported
on answer sheets, in personnel files or on cumulative record cards.
I recall, for example, a set of records from a New York City school
which contained half a dozen or so IQs of 400 and over, twice that many
in the 300's, and as for IQs of 200 or so, they were quite routine. I recall
also a high school testing in Nebraska in which all but three or four of
the seniors scored above the ninety-fifth percentile (national norms) on
a clerical speed test. As my daughter would say, "Somebody goofed!"

One hopes that no responsible person gave serious credence to such
outlandish scores, though their presence in official records does make
one wonder. More serious than these dramatic bits of nonsense are the
thousands of less dramatic, and consequently less conspicuous, scores
which seem possible enough but which are really incorrect reflections
of the testee's ability: misleading information as a result of someone's
failure to read and heed test manuals.
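
The dramatic errors, at least, are the kind a simple clerical check could catch before they reach the record card. The following is a minimal sketch of such a check, not a procedure taken from any test manual: the function names, the permissible IQ range, the sample scores, and the one-half threshold are all hypothetical, and the real limits would have to come from the manual of the test in use.

```python
def out_of_range(scores, low, high):
    """Return (position, score) pairs lying outside the permitted range."""
    return [(i, s) for i, s in enumerate(scores) if not (low <= s <= high)]


def share_above(percentile_ranks, cutoff=95):
    """Fraction of a group whose national percentile rank exceeds the cutoff."""
    if not percentile_ranks:
        raise ValueError("no scores supplied")
    return sum(1 for p in percentile_ranks if p > cutoff) / len(percentile_ranks)


# Hypothetical limits; the manual of the test in use supplies the real ones.
IQ_LOW, IQ_HIGH = 40, 180

recorded_iqs = [102, 415, 96, 310, 205, 118]      # made-up record card entries
suspect = out_of_range(recorded_iqs, IQ_LOW, IQ_HIGH)
print("IQs to re-score:", suspect)

senior_ranks = [99, 98, 97, 96, 99, 98, 97, 60, 99, 96]  # made-up class results
# By definition only about five in a hundred of the norm group exceed the
# 95th percentile, so a class far above that share deserves a second look.
if share_above(senior_ranks) > 0.5:               # threshold is an assumption
    print("Check timing, scoring key, and norms table for this class.")
```

A check of this sort catches only the outlandish entries; the less conspicuous errors described above still depend on the manual having been read and followed.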






The list of users' responsibilities could be expanded almost indefinitely;
the points selected above are illustrative rather than exhaustive. The
whole matter can perhaps best be summarized in two sentences. The
publisher should feel obligated to prepare instruments which earn the
user's respect by being psychometrically sound, conceptually modern,
and administratively and economically practical. The user is under an
even stronger obligation to cooperate in the development of these instruments
and to support those which deserve support, not only in
terms of purchase but also in terms of intelligent application and interpretation.
The best portrait painter in the world would be handicapped
by a house-painter's four-inch brush; the finest artist's brush obtainable
would create no masterpiece in the hands of the untutored.





How Basic Organization 
Influences Testing 

DAVID H. DINGILIAN



I. Introduction 

Both World War I and World War II had a marked influence on the 
use of tests by the entire nation as well as by education. 

Without any intent to be historically accurate in detail, it might be
safe to say that, broadly viewed, World War I and the period immediately
following it seemed to give impetus to the use of tests for the
purpose of measuring and appraising the individual. During and since
World War II tests have been used on a much broader scope. The users
seemed to have added to the purposes of measurement and appraisal
the desire to assess and understand the individual by way of counseling
and other techniques.

Within learned circles of professional test users there undoubtedly 
exists a cautious estimation of the values gained and progress made
because of the use of scientific tools such as tests.

One cannot help but wonder, however, if there is a parallel attitude
on the part of the legion of inexpert test users who were willy-nilly
thrown into the business of using tests without any disciplined orientation
or training concerning the possibilities and limitations of those tests.

Here is a vice-president of one of the country's largest financial houses
who says, "You psychologists sure have a black record with our outfit!"
Then he proceeds to name three persons who came, tested, collected
rather fat fees, and departed. Not too much was left behind in the way
of insights about tests, except the hindsight of the management which
indicated that it had an over-abundance of confusion and considerable
unresolved feelings about "psychologists."

Here is a teaching colleague who comes to this writer with the delightfully
naive, but nevertheless disconcerting, question: "I sent Jim
down to the counselor and he tested him. Why hasn't his behavior
improved? He was tested, wasn't he?"

Whether it is justifiable or not, we must face up to the fact that too
many people have the uneasy feeling that psychologists specializing in
the area of tests have promised too much too fast and, thus far, delivered
altogether too little. To keep this kind of feeling from compounding, we
must find ways wherein business and educational institutions, state and
local, can use tests under greater scientific auspices.

One way to cut down further uneasiness about the use of tests is to
invent ways whereby psychologists may be more active in testing programs.
Let us take one field as an example. So much of the activities of
the broad area known as counseling and guidance emanates from the
seminars, practicums and laboratories of the behavioral sciences, particularly
the discipline of psychology. How tragic to make such a fine
contribution, say, for example, to the vital realm of public education,
and yet have so many professional psychologists not close enough to
the pulse of the activities to which they have given impetus!

In the main, departments of education and educational psychology 
have shown more interest in, and rendered greater professional assistance 
to, the test users than have departments of psychology. Yet, ironically, 
the large bulk of the tests themselves came from the fertile genius of
rigorously trained psychologists. The question has occurred to this 
writer many times as to why psychologists do not seem to want to follow- 
through to help toward the proper incorporation of the tools of their 
discipline into the organizational structure of such areas as education 
and industry. 

It took the American people a long time to accept the common school
as a basic institution. It took another seventy-five years to make the
modern American high school part of the pattern of free public education.
Recently, signs indicate that the junior college will be the next
widely accepted institution. As one begins to rejoice about such matters,
he senses that so much is being invested in buildings and excellent
curricula without an equally adequate investment in orienting personnel
to scientific methods. It almost looks as though we are guilty of extending
a sentimental invitation which says, "Come one, come all; we are
the servants of the next generation and offer you free education."

There must be an end to the developing of expensive and highly
specialized schools and curricula without clearly defined criteria as to
who should study what, how long, and where. Entrance requirements
to numerous educational institutions are still entirely too devoid of any
careful appraisal of the talents and abilities of the applicants. Most of
these institutions are cluttered up with pupil-personnel whose assets
probably consist of good intentions, a willing staff of public servants,
and the democracy of opportunity.

Perhaps it would be appropriate to repeat a question raised by Daniel
Starch in his address to the Invitational Conference of 1954. Some may
remember that he stated, "It is fair to say that more advance in scientific
knowledge has occurred during the last fifty years than in all previous
centuries combined." Then he asked, "Why is there so wide a gap
between what we know and what we do with what we know, between
knowledge and the wisdom of how to use that knowledge?"

My first point is a disconcerting one. As a person who has spent the 
last ten years attempting to bring the use of tests into greater and 
greater prominence in education, I have found myself altogether too 
lonely. Persons in such roles as director of guidance, or director of a 
counseling center, are caught between a vast public, which wants more 
scientific help, and educational administrations, which have not been 
sufficiently structured about the values of tests by high echelon organizations
from the professional area of psychology.

As one views the current scene, other than the reasonably favorable 
environs of the college and university campuses, he cannot help but be
seriously disturbed about what has happened as a result of separating
the tests from the insights of the test makers. Those who distribute
tests may be said to be doing a good job of "selling" the idea of the use
of tests. But, no matter how noble their efforts, their motivations are
still linked by the test users with promotion for profit rather than with 
zeal for scientific rigor. Whether this is true or not is secondary to the 
fact that such motives are imputed to them. 

In contrast to the harum-scarum growth of the use of tests in most 
elementary and secondary school situations are the numerous excellent 
student personnel programs which have been developed in many colleges 
and universities. There, one has a sense of orderliness and a feeling of
experiencing closure. There is a rationale present. An appreciation of 
the values and limitations of tests is constantly a part of the multi- 
faceted services being rendered. The user of tests at the college level is 
also, in most instances, the renderer of the service or the supervisor of 
those whom he is training to render service. Sound organization is 
present. Expediency and improvization are the exceptions rather than
the rule. 

The college or university pattern seems to include the setting up of 
structure and basic organization prior to the rendering of services which 
involve the use of tests. Why such a pattern has not jelled in elementary 
and secondary schools is a question which a conference such as this
might explore. 

It seems to this writer that the matter of centralization versus de- 
centralization, organizationally speaking, is a false dichotomy. When 
one views both types of approach he soon sees that they can both be
good or bad. The main variable may reside in the answer to the question,
"Is there a doctor in the house, a doctor of psychology, that is?" One 
who knows tests, has constructed them, can teach the amateurs how, 






TESTING PROBLEMS 






as well as how not, to use them. Most important of all, one who is
sympathetic as well as flexible about the kind of in-service education 
which he himself may need. It is equally important for him to understand 
properly the purposes and objectives of the organization or institution 
which has sought out his scientific skills.

II. An Example of a Testing Program in Action

Having indulged in quite a bit of "Monday morning quarterbacking"
calls for some kind of a constructive suggestion. What does happen 
when staff members of an organization take time to work out together 
a philosophy, organizational structure, and patterns of rendering
services? It occurs to us that perhaps an example of a specific organization
and its step-by-step evolving of an approach to the use of tests within 
the framework of counseling service might be of some interest at this 
point. 

This writer was for eight years director of a counseling center which 
started out as a unit for serving veterans only. Within three years it 
was expanded to include a nine-to-twelve-hour service to graduating 
seniors of sixteen to eighteen high schools. The rendering of such services 
created kinds of problem situations which could have caused the organi- 
zation to move in the direction of operating, either from improvization 
and expediency, or from a base of theoretical constructs which gave 
the staff a feeling of scientific precision and discipline. 

When fully staffed, the center had a total personnel of seventy-five. 
The number and classifications of the staff were: fourteen psychologists, 
one of whom was the senior psychologist; twenty-eight counselors who 
came from a background of education; one supervisor and four assistant 
supervisors with backgrounds of education; and twenty-eight clerks, 
one of whom was principal clerk. 

For nearly five years, this staff served a total annual case load of
eleven to twelve thousand persons.

Ways of functioning which were entirely new had to be worked out. 
The staff decided to give priority to the evolving of a philosophy and 
pattern of organization. The fact that no previous organization, and 
hence no precedents, confronted the group was viewed as both a hazard 
and an advantage. The staff agreed to schedule only that amount of 
case load which would permit sufficient time to formulate, during in- 
service meetings, the beginnings of a philosophy and organizational 
structure. This firm resolve not to permit the pressure of case load to 
sidetrack adequate joint planning proved to be the most fruitful
decision in the history of the guidance center.






A. Philosophy 

In order to define a philosophy to use as a frame of reference, the
staff listed six basic tenets. It isn't within the scope of this paper to
elaborate the details of these tenets. A summary of the main ideas can
only give us clues as to the values of consensus-making.

First, the complexness, as well as the vastness, of modern knowledge 
caused the staff to feel a need for synthesis. They realized that they had
to be both specialists and generalists. 

A modern synthesist, they decided, would create a mosaic made up 
of the findings of the various disciplines which deal with man. He would 
build a dynamic frame of reference which would depict "man and
culture" as process, as change, as interaction, and as a continuum. In 
this approach, the synthesist would move from a point of departure to 
a point of view which, when sufficiently strengthened and enriched,
could become a way of life; a way of life which would have one main 
objective: to instrument the scientific method toward serving the needs 
of our democracy. 

A second tenet had to do with the hypothesis that all behavior is 
caused behavior. The individual's early environmental and inter-personal
relationships as molded within such frameworks as the family
constellation and peer groups, have given him an approach to life which
he is now living out. The point which the staff agreed to emphasize was
the necessity for approaching clients with an attitude of not condemning
or condoning behavior, but understanding it as caused behavior.

The third tenet was viewed as being rather closely tied in with the 
second. Studying the outgrowth of nearly fifty years of clinical work in 
psychopathology and other related disciplines, and certainly, the
influence of Rogers and his associates, helped the staff to explore the
assumption that the client has the capacity and wisdom to help himself.
They concluded that the client, if given certain psychological
conditions, could reorganize his "field of perception"; he could alter the
way he sees the field as well as himself, and hence modify his own
behavior. 

The fourth tenet of the staff's philosophy had to do with tests and
testing. They agreed to be very careful that their possibilities, as well 
as limitations, were constantly kept well defined. They experienced the 
numerous ways in which tests so easily lend themselves toward being 
used as crutches. Counselor X would catch himself riding this interest 
test. Psychologist Z would become emotionally identified with that
capacity test or projective technique. For the entire staff, tests and testing
had to become just one of a series of factors making up the configuration 
which they decided to call advisement. 






Having cautioned themselves about pitfalls, they agreed to develop
the most complete and up-to-date test library which could be had.
More than 225 of the so-called best standardized tests were procured.
The psychologists were ever on the alert for new tests, new norm data,
new validations. Four main objectives in regard to tests and testing
were agreed upon. They were: (a) to use tests as means, and never as
ends; (b) to have on hand the most scientific tools in order that the
diagnostic work of the staff could be considered adequate by the best
scientists in the field; (c) to avoid the pitfall of assuming that proper
diagnosis is all that is needed, that it ends all the needs of all the clients;
and (d) to be constantly aware of the special implications for test-users
of the work of allied fields.

In formulating the fifth tenet in their philosophy of guidance, the
staff recognized the need for an adequate concept of change. This meant
discarding any tendency to use the illusory safety of absolute truth as a
"thing" to cling to, and substituting the more challenging scientific
"method." Concepts came to be valued in terms of their utility rather
than validity. As the group sought to avoid the rigidity which comes
with unchanging values, they tried to be wary of such dichotomies as
nature versus nurture, introvert versus extrovert, or good versus bad.

The sixth tenet stressed the importance of having adequately defined
democratic ends and methods. The group agreed that democracy
is an idea which is much broader than a mere political or economic
doctrine. They found that where it had been instrumented, it had
tended to become a way of life, both social and individual. They agreed
that its crux lay in the phrase "informed participation." The staff
members who came to believe sincerely in democratic consensus-making,
came to know intimately the fact that where there is no participation
there comes into play a subtle and insidious form of behavior. It changes
the climate so that persons in authority flirt with the idea of suppressing
further participation. Persons under authority begin to do their job
with the attitude, "I merely work here." There begins a passivity, an
indifference. These, in turn, grow into a sort of game between those
who control and those being controlled.

During the actual functioning of the guidance center, the staff came
to appreciate more and more the way in which an adequate philosophy
might affect the quality of services rendered to the client. Among
themselves, it fostered stimulating and worthwhile discussions as to the
merits and demerits of different tools. And, in regard to tests, their
philosophy caused the users to stop, look, and listen.






B. Organization

Once the beginnings of a philosophy were outlined, the next step
seemed obvious, namely, implementation of this philosophy into
specific organizational structure.

Following are a few of the decisions which were agreed upon by the
entire staff:

(1) Planning and policy-making should include not only the
administrative and supervisory personnel but also a representative elected
on a rotating basis from the clerical, psychological and counseling
personnel.

(2) This group would be known as the administrative council. Each
member, irrespective of rank or classification, would have equal
responsibility to place on the regular agenda items which he felt to be
important.

(3) In addition to the regularly scheduled meetings of the
administrative council, each Thursday afternoon would be set aside for staff
meetings. One of these would be the monthly general staff meeting.
The other Thursdays would be used for special meetings of clerks,
counselors and psychologists.

(4) Responsibility for conducting the meetings would rest with the
group concerned. (a) The general staff meetings would be the
responsibility of the head supervisor or of any of the other six supervisory
assistants on the Center's table of organization. (b) The other meetings
would be the joint responsibility of the assistant supervisor in charge
of in-service education and the elected representative of each of the
three classifications of personnel.

(5) The nature of these meetings would be: (a) business items
pertinent to each of the three groups; (b) in-service education appropriate
for each of the three roles.

(6) If the above machinery for communication did not adequately
facilitate the two desired values of maximum participation and
consensus-making, the administrative council would effect the necessary
changes.

Again, we must say that greater detail on organization, than that
outlined above, cannot be gone into because of the limited purpose of
this paper. It is certainly true that such a structure became the anvil
on which was hammered out every item that some staff member wanted
explored. The items ran the gamut from uses and abuses of the
morning and afternoon coffee breaks to the most prolonged and
detailed discussions of the relative merits of this versus that test, as being
the most valuable tool to measure interest, abilities, temperament or
aptitudes.






C. Operations 

Someone has said that philosophy will not butter your bread, but
it will make it taste better. The staff at the guidance center came to
appreciate this maxim greatly. The operational plan which emanated
from that philosophy was built on four basic concepts. First, it would
be the responsibility of administrators and supervisors, at all times, to
facilitate and not frustrate the processes of rendering guidance services.
Second, establishing a flow of well-defined and orderly steps was
imperative, because the quality of the end product would be determined
by the nature and quality of the different guidance experiences offered
to the client. Third, the client would have to be involved to the
maximum in each of the steps which he would experience. Fourth, the
weakest or the strongest staff member, be he a counselor, clerk,
psychologist, or administrator, would be strengthened by the process if
the various steps were well-defined. The assumption was that there is
an objective kind of discipline involved in knowing what operational
patterns go into gear in regard to such activities as the interview, test
prescription, use of occupational information, or tentative selection by
the client of three to five objectives.

Let us have a look at the steps involved in advisement:

(1) Structuring: This can be done on an individual or group basis, at
the center by the intake supervisor, or at a school by one of the members
of the team of counselors and psychologists. The idea is to define the
steps of the service, to tell what it does not do as well as what it does,
to impress upon the client the importance of his role in increasing or
decreasing the effectiveness of the service and to see that he has a chance
to ask questions. The client needs help in seeing the point of view of the
staff regarding the gathering and sharing of factual information, in
contrast to their reluctance to interpret, advise or prescribe.

(2) Basic Testing: An interest inventory, a mental capacity test and
a personality test are administered to each client. As a single client he
may be able to go from structuring directly to testing. As a member of
a group in one of the schools, he might have a time interval of one to
three days between structuring and basic testing.

(3) First Interview: The client discusses with a counselor the following
data: (a) his personal-social background; (b) the results of the basic
tests; and (c) the making of a tentative list of ten to twenty vocational
objectives which are compatible with the client's background
information and basic test data.

(4) Aptitude Testing: The client explores with one of the psychologists,
his counselor, or both, the kind of aptitude testing which is indicated.
The counselor needs to be sure that the information compiled in the
first interview is the basis from which clues are gathered for the kinds
of aptitude testing which are agreed upon.

(5) Second Interview: The counselor and the client review the results
of the aptitude test, and relate such information to other test data,
background information, and the list of ten to twenty tentative
objectives previously selected. If no further testing is indicated, the client
selects three to five of the most important vocational choices which he
and the counselor agree upon. These may all be from the list of ten to
twenty, or one or two new ones may be added.

(6) The client may invest the minimum of a half-day in studying his
three to five objectives in the center's occupational information library.
There an occupational information specialist and clerical assistants
make available to the client and the counselor the kinds of data relevant
to the client's vocational choices.

(7) Terminating Interview: After finishing his study of the three to
five choices, the client returns to his counselor to discuss those which
seem to be the most realistic for him. Together with the counselor, he
outlines a plan for (a) further education, (b) immediate employment,
(c) a training program or (d) need for further counseling if none of the
first three can be agreed upon.

(8) Parent Interview: Upon completion of the guidance experience
the counselor discusses with the client, if he is a young student rather
than an adult, the availability of time for a parent interview. If the
client concurs, he is asked to sign an information consent sheet, so that
the counselor may feel free to discuss data about him with his parents.

(9) Re-evaluation: The client is invited to come back for any future
review of his plans, if and when such a step becomes a felt need.

In this brief statement we have attempted to communicate two ideas.
The first one has expressed our dissatisfaction with what we see on the
current scene in regard to the use of tests. It seems to us that the
scientist has made excellent progress in the area of inventing new tools
in the form of tests. He has not, however, stayed on the scene in order
to hold to a minimum those inevitable abuses which he alone could have
anticipated when he saw tests falling into the hands of inexpert users.

The rigorous standards for proper test usage which good psychologists
advocate have not been adequately upheld, even in the universities
where student personnel services are so close to the birth of research.

The second thing which we have attempted is a brief description of a
counseling organization which came to use tests in as scientific a
manner as it was able to devise. Our assumption was that in such an
example we would find clues as to how basic organization influences
the use of tests.

III. Some Generalizations

We are now ready to make a few generalizations. We label them as
such with the hope that perhaps some of them might contain the seeds
for eventual restatements in the form of hypotheses. These could be
tested in future research, for example, on the nature of the relationships
between organizational structure and test usage. They are not worthy
of being called hypotheses at this point.

(1) The first generalization which occurs to us is that, in the light
of what has been said in sections I and II, the title of this paper might
well be changed from, "How Basic Organization Influences Testing,"
to, "How Individual Needs of Clients Influence Both Basic Organization
and Testing." It is our conviction that the self-realization needs of
human beings seeking help, when respected and listened to, facilitate
the setting up of unique organizational structure and creative test usage.

(2) The following innovations, which grew out of the experience of
the organization, are used as an example and have significance for the
client, the professional staff serving him, and for education.

These innovations are: (a) Clients must be given an opportunity to
become aware of the possibilities and limitations of any guidance service
which they seek. This makes for economy in relationships and budget.
Such economy is documented by the fact that the screened but not
returned (SNR) percentage of the Center remained between two and
three during the first eight years of its existence. At one time this writer
was told by the directors of three large university guidance centers that
their percentage of SNR's ranged between eighteen and twenty-two.
They did not use the screening interview as a structuring technique. A
receptionist asked the client to fill out some forms and go directly to
the first step of the service, which usually was testing.

(b) Clients must be given their test results in circumstances which
would be agreed upon by experts as containing the maximum in the way
of learning conditions.

(c) Test and other data must be treated as confidential and belonging
to the client. This approach helps him to feel more responsible as
an active collaborator in the process which is designed to help him.

(d) The relationships between the client and the professionals who
are helping him must be built upon the assumption that he will be
treated as a person, in most instances, capable of making his own
decisions without pressure. The concept of consequences takes on real
significance when he learns that his own decisions as to occupational
choices will be accepted even though they may be unrealistic.






(e) In order to be permissive, accepting, and treat the client as a
collaborator, the staff needs a continuous program of in-service education.
The minimum by-products of such a program usually are: greater
skill in counseling, competence in using insights from psychology, and
facility in handling sound vocational information.

(f) As a result of systematic and well-defined steps, which both
professional workers and the client must experience in the process of using
tests and doing counseling, serious oversimplifications are reduced, if
not entirely eliminated. For example, the staff worker finally gives up
the idea that he must tell and accepts the idea that he must help. As to
the client, he gives up such notions as, "I must do what the tests or the
psychologist tells me to do." He comes to accept such ideas as, "There
is no one single occupational career for each person." He begins to feel
comfortable about the fact that he can expect reasonable success in any
one of the three to five objectives which he has selected.

(g) The approaches to the use of tests and counseling techniques
developed here have resulted in a new kind of knowledge about
adolescents as well as the types of occupational objectives which they select.
These have valuable implications for education in general and modern
curriculum building in particular.

IV. Questions

Let us close with a few questions.

(1) Should not one of the testing services develop a model, so that
a group which starts out on a venture such as the one described here
does not flounder? It seems to us that just test research on norms,
reliability and validity is not enough.

(2) Should the A.P.A., A.A.A.S., or some similar organization with
status and scientific know-how, provide field staff services to help orient
top echelon people in education and industry regarding the use of tests?
E.T.S. is already helping in this regard.

(3) What is being done to disseminate, throughout the country,
information about promising practices concerning the use of tests which
have proven their merits?

(4) Would it be pertinent to explore carefully the values of a practice,
now used in several states, of contracting with professionally competent
agencies for the rendering of diagnostic testing services to be followed
up with individual and/or group interpretation of results to pupils,
parents and staff, as well as evaluation of the worth of such contracted
services?

Let us close with the thought that, in the final analysis, the quality
of any testing or counseling program is dependent upon the nature of
the human relationships among those involved in the rendering and the
receiving of the services.

If the relationships are built upon a foundation of mutual trust, 
acceptance, understanding and cooperation, the end-product of the 
program will be adequate and effective. 

A good service is made up of more than testing; it is more than coun- 
seling or the giving of occupational or educational information. It is all 
these things plus a point of view and a way of life. 

Mechanics, tools and techniques must remain means. The end must
be a warm, contagious and humane program for all who honor us by
saying, "I need help." We can do no less than to strive to achieve such
a program. 




The Psychologist and Society 



MORRIS S. VITELES



The title of this address, The Psychologist and Society, is sufficiently
broad in scope to permit a variety of approaches, at different levels of
discourse, to the discussion of the impact of psychology upon society.
The approach adopted for purposes of this meeting will appear as the
discussion proceeds. However, in order to avoid initial misunderstanding,
I might refer to a few aspects of the situation which I do not propose
to discuss, even though their consideration could well fit into a
conference on testing.

Specifically, for example, I shall not talk about tests and testing as
such, although I might well enjoy the opportunity to talk about the
effects upon the individual and upon society of testing programs which
range from the diagnosis of feeble-mindedness, about which the
psychologist knows something, to the use of inadequately validated clinical
methods in the identification of executive talent, concerning the nature
of which little is known.

I could show equal feeling in talking about several consequences
which ensue from success in undermining confidence in teachers and
in schools through a publication in which partial data and quotations
are used to give credence to conclusions which are even contrary to
those reached by the research investigators themselves. There is no
doubt that the psychologist can and does render a great disservice to
society when he employs such tactics, characteristic of the propagandist
and of journalistic irresponsibility, in dealing with even such a limited
area of human activity as the acquisition of basic skills.

In spite of the temptation to talk of such relatively simple impacts
of the psychologist upon society, I have chosen, instead, to devote this
talk to the more complex and higher levels of discourse which come to
the fore when the psychologist undertakes to remake society itself.*



* The succeeding portion of this paper is taken from an address delivered at the
closing session of the 12th International Congress of Applied Psychology (London,
1955) and published under the title of "The New Utopia" in Science (Science, 1955,
122, No. 3181, 1167-71). Acknowledgment is made to Science for permission to
reprint the material.




Throughout the history of civilization man has been intrigued by the 
possibility of remaking thi.s unsatisfactory world into a better one — one 
formed in the image of his personal perceptions, aspirations, and values. 
In saying this, I do not have in mind the broad conceptualizations of
philosophers and religious leaders, such as the Ten Commandments of
Moses, the Golden Rule of Jesus, the Five Relationships of Confucius,
the Four Noble Truths and the Noble Eightfold Path of Buddha, or
other ideal standards of conduct that have exercised tremendous
influence in a variety of very different cultures. On the contrary, I am
referring to detailed plans for reordering the formal organization of the
community, for spelling out the structure, the details of daily life, and
the specific patterns of individual form and conduct. Exemplified early
in Plato's Republic, such projects have, through the writings of Sir
Thomas More, made the word utopia a commonplace conception in the
languages of the world. 

Plans for creating similar seats of "ideally perfect society and political
life" (1) have come from a variety of sources. Literary men such as
Samuel Butler (2) in England, Edward Bellamy (3) in the United States,
and in a sense Cyrano de Bergerac (4) in France, to mention only a
few, found means for describing the inadequacies of civilizations known
to them and fertile outlets for their imaginations in the design of fairer
worlds, in the pursuit of the perfect way of life, or in the words of
Matthew Arnold, of "sweetness and light" (5) as a way of life.

Until recently the architects of Utopia have, perforce, found it
necessary to accept man as he is and to satisfy themselves with manipulating
his environment and his institutional relationships, primarily economic
and political, as a way of remolding the world and, as the great son of
the Persian tent maker wrote, bringing it "nearer to the Heart's desire"
(6). It will be recalled that Rousseau, in fact, took the position that man
himself, natural man, is a noble creature, corrupted only by the
artificial and degrading civilization imposed on him (7). The utopias of
Rousseau and of his literary disciples such as Chateaubriand (8), were
thus quite consistently characterized by a rejection of the [illegible]
trappings of so-called civilized life and a return to primitive [illegible]
existence.

Utopian Engineering by the Psychologist 

Today, by contrast, the creators of a "brave new world" undertake 
their task with avowed capacity actually to remake man himself and
thereby to achieve the states of inner and outer perfection which, in the
past, were promised only in the afterlife. As illustrated in the satirical
novel by Aldous Huxley (9), biology furnishes the mechanism for modifying
inherent and supposedly inflexible characteristics of the individual
by manipulation of the embryo itself; physiology and psychology
provide the tools for early and complete conditioning of the individual to
a man-made world of perfected order.

The application of such psychological tools for this purpose finds even
more concrete expression in the creation of Walden Two (10), a new
Utopia designed by the outstanding American psychologist Burrhus F.
Skinner. Here, with unbounded faith in the capacity of a science of
human behavior to change such behavior, Skinner subordinates "natural
man" to the socially adaptive and conforming influences of scientific
methodology.

Skinner's approach to a new Utopia is epitomized in the answer given
by the founder of Walden Two to a question bearing on the failure of
earlier attempts to establish perfected centers of community living.
The crucial fault, he points out, was the absence of psychological
management. "The cultural pattern was usually a matter of revealed truths and
not open to experimental modification except when conspicuously
unsuccessful. The community wasn't set up as a real experiment, but to
put certain principles into practice. These principles, when not revealed
by God, flowed from a philosophy of perfectionism. Generally, the plan
was to get away from government and to allow the natural virtue of
man to assert itself. What more," adds Frazier, the fictional protagonist
of the new Utopia, "can you ask for as an explanation of failure?"
(10, p. 129).

Beliefs underlying this approach find expression in Skinner's scholarly
writings, particularly in his book Science and Human Behavior (11). It
is here that Skinner commits himself to the view that the deliberate
manipulation or control of cultural practices and human behavior is
a necessary feature of any civilization and the road to progress toward
a better way of life. It is here also that he formulates survival as a
criterion in evaluating control practices. Likewise, the crucial role assigned
to a science of human behavior in relation to controlled cultural change
is made apparent in this text. "We have," he writes, "no reason to
believe that any cultural practice is always right or wrong according
to some principle or value regardless of the circumstances . . . Science,"
he adds, "helps us in deciding between alternative courses of action by
making past consequences effective in determining future conduct.
. . . The formalized experience of science, added to the practical
experience of the individual in a complex set of circumstances, offers the
best basis for effective action" (11, p. 436).




It is noted by Skinner that experimentation involving control of 
cultural practices may yield findings that are distasteful to Western 
thought, which has emphasized the importance and dignity of the 
individual and the philosophy (accepted, according to Skinner, by
many schools of psychotherapy) that "man is the master of his own
fate" (12, pp. 4^68). "If," he concludes, "science does not confirm the
assumptions of freedom, initiative, and responsibility in the behavior
of the individual, these assumptions will not ultimately be effective
either as motivating devices or as goals in the design of culture. . . .
We may console ourselves with the reflection that science is, after all,
a cumulative progress in knowledge which is due to man alone, and
that the highest human dignity may be to accept the facts of human
behavior regardless of their momentary implications" (11, p. 449).

Implicit in this quotation is the view that this approach involves no
value judgments by the scientists who conduct experiments in controlling
cultural design and modifying human behavior. In fact, Skinner elsewhere
states explicitly that "our problem is not to determine the value
or goals which operate in the behavior of the cultural designer; it is
rather to examine the conditions under which design occurs" (11, p. 433).
However, it does not seem clear, at least to me, that Skinner has adhered
to this position. In spite of his assertion to the contrary, the choice of
survival as a criterion for evaluating control, and the choice of a science
of human behavior as mediating mechanism in deciding with respect to
alternative courses of action, appear very clearly to be value judgments.
Furthermore, with the literary license allowed to the novelist, Skinner
in Walden Two has exercised wide latitude in this respect and thereby
has revealed the dangers that arise when, in a life situation, the
psychologist does, in fact, implement the view that his science makes him
the architect preeminent of the Utopian way of life.

There occurs, for example, a discussion of the community educational
program. A visitor, named Castle, raises a question concerning student
motivation. "Why," he asks, "do your children learn anything at all?
What are your substitutes for our standard motives?"

To make clear the issue under consideration requires, unfortunately,
a somewhat lengthy quotation from Skinner's novel, which goes on as
follows (10, pp. 101-102).

"'Your "standard motives" - exactly,' said Frazier. 'And there's the
rub. An educational institution spends most of its time, not in presenting
facts or imparting techniques of learning, but in trying to make its
students learn. It has to create spurious needs. Have you ever stopped
to analyze them? What are the "standard motives," Mr. Castle?'






'I must admit they're not very attractive,' said Castle. 'I suppose
they consist of fear of one's family in the event of low grades or expulsion,
the award of grades and honors, snob value of a cap and gown, the
cash value of a diploma.'

'Very good, Mr. Castle,' said Frazier. 'And now to answer your
question - our substitute is simply the absence of these devices. We
have had to uncover the worth-while and truly productive motives. . . .'
'We made a survey of the motives of the unhampered child and
found more than we could use. Our engineering job was to preserve
them by fortifying the child against discouragement.' . . .

Following a description of the use of "conditioning" in building up
tolerance to discouragement, the founder of Walden Two goes on to say,

'Building a tolerance for discouraging events proved to be all we
needed. . . . The motives in education, Mr. Castle, are the motives in all
human behavior. Education should be only life itself. We don't need to
create motives. We avoid the spurious academic needs you've just
listed so frankly, and also the escape from threat so widely used in our
civil institutions. . . . We don't need to motivate anyone by creating
spurious needs.'

Skinner uses here, of course, a device commonly employed by both
literary men and expert propagandists in lulling the reader into at least
the provisional acceptance of his viewpoint. It is that of molding
attitudes by the choice of appropriate adjectives, illustrated in the
quotation by the phrases "the snob value of a cap and gown," "the cash
value of a diploma," and most of all by the repeated reference to
"spurious needs."

Social Science and Social Reform

The last of these phrases, "spurious needs," brings into relief the
situation that has produced both the title of this address and its content.
This, briefly, is the increasing tendency on the part of the psychologist
to inject value judgments in a manner that makes it increasingly
difficult, especially for the layman, to determine when the psychologist is
dealing with facts and principles derived from experiments, or when he
is merely presenting his own value judgments (13). It has, in other
words, become increasingly difficult to know when the psychologist
speaks with the authority of science, or when he is playing the role of
the social reformer while clothed, or even disguised, in the garb of
the scientist.

In saying this, I am, naturally, not denying the right of the
psychologist to his opinion, to his own value judgments. He, as every other
free man, is entitled to believe that a cap and gown is, indeed, a stigma
of snobbery; that a diploma is prized only for its cash value; that money
is crass; that, as Rogers believes, religion, and also Freud, are to be
criticized for permeating our culture with the false concept that man
is sinful (12); that prejudice and discrimination are used by dominant
groups to defend their vested interests (14), and so forth. As a citizen,
the individual psychologist is free to express any such opinion,
regardless of how unpopular it may be among his professional colleagues or
among the mass of people in the culture of which he is a part. It is not
his privilege, however, to clothe the source and personal nature of such
opinions in the language or form of scholarly writing to the point where
it would appear that they are the outcome of scientific inquiries.

Reference to Walden Two as a device for presenting this issue does not
reflect the opinion that Skinner has been particularly remiss in this
respect, in comparison with other psychologists. This fictional
representation of his personal views by a notable and conscientious scientist merely
provides a springboard for the discussion of a major issue in psychology.
It is an issue that grows in significance with the multiplication of
publications where the failure to distinguish between conclusions supported
by experimental evidence and those representing personal value
judgments becomes a medium for the support of cultural practices or changes
deemed to be desirable by the scientist.

The frequency with which this occurs lends support to the opinion
that many psychologists have reverted to Plato's conception of method,
as stated in the Phaedo, namely, "This was the method I adopted: I first
assumed some principle, which I judged to be the strongest, and then
I affirmed as true whatever seemed to agree with this . . . and that
which disagreed I regarded as untrue." The fact that, in most instances,
the individual psychologist is not engaged in the patterning of an entire
Utopia, but rather in what Popper in The Open Society and Its Enemies
(15) has called "piecemeal social engineering," does not diminish the
seriousness of the situation under discussion, especially in an era that
has raised the psychological expert to a level of considerable influence.

Essays in Piecemeal Social Engineering

Many examples of this situation can be cited. A thought-provoking
article by Gardner Murphy, entitled Human Potentialities, furnishes
one such illustration. Here, Murphy formulates five basic principles
for "permitting the discovery of human potentialities," including among
these, as a negative principle, to avoid the competitive. "Not," he wrote,
"because competition is always bad, but because it frustrates and
benumbs those who fail, and because for those who succeed it can at best
give only the ever iterated satisfaction of winning again and again. In
this direction lies, of course, a convenient way of maintaining a status
minded society; but I am speaking of something quite different, namely,
the release of human potentialities" (16).

Accepting Murphy's statement that he is interested primarily in the 
release of human potentialities, there still arises the question whether 
there are, indeed, facts available to support the use of the word principle 
instead of judgment or opinion in the context of his statement.
Furthermore, the reference to "status minded" society introduces at least an
implication that "competition" is a socially undesirable practice, as well
as a handicap to the full and healthy development of the individual.

Examination of the literature — particularly that of social psychology 
— indicates that competition is quite frequently treated as though it 
has been demonstrated with considerable certainty that this is a noxious 
cultural practice. In addition, by associating capitalism with competi- 
tion, onus is reflected on the capitalistic system, as compared with other 
and, by implication at least, superior economic and social systems. 
Thus, according to Newcomb, the higher frequency of exposure to 
failure, threat, and insecurity that exists where importance is attached
to competitive success makes it "no wonder that psychiatrists like
Alfred Adler found feelings of discouragement and inferiority prominent
in the neuroses of Western society" (17). In a somewhat broader
context, Asch states the requirements that distinguish between a "society
of atoms, each arrayed against all, organized on the predatory principle
of homo homini lupus and one organized around the idea of a community
of men." The former, it is made clear, is one built on the "calculation
of private profit." Only an inferior brand of social organization can be
anticipated from an "ego-centered thesis" that "describes the balance
achieved in society as an uneasy and antagonistic mutual limitation
of each by all" and that "reduces every trace of solidarity to the pattern
of relations in the business market" (18).

How many facts, from how many studies, are available to support
such judgments with respect to the individual and social roles of
competition? Newcomb's reference to Adler's statement concerning the
frequency of neuroses in Western (competitive and capitalistic) society
merely raises again the perennial questions concerning what constitutes
"neurosis"; concerning the amount and quality of research underlying
psychiatrists' dicta, and even concerning the nature of the sample
observed by the psychiatrist. The last of these questions is neatly
disposed of in the reply given to the query "Whom has the psychiatrist
been observing?" in a humorous but nevertheless challenging book
entitled How to Lie with Statistics. "It turns out," it is pointed out,
"that he has reached this edifying conclusion from studying his patients,
who are a long, long way from being a sample of the population. If a
man were normal our psychiatrist would never meet him" (19).

Perhaps the situation with respect to research on competition versus
cooperation is not quite as bad as this. However, the fact remains that
studies bearing on the effects of competition on the individual and on
groups are few in number. Furthermore, the size and nature of the
samples involved in such studies, the restricted and frequently artificial
settings in which they are conducted, the manipulation of theoretical
concepts and experimental variables, and so forth, make it quite
impossible to derive broad value judgments pertaining to the role of
competition in social progress. Available experimental findings do not
provide grounds for discarding lightly the opinion, expressed in a
prophetic dissent by Justice Holmes of the Supreme Court of the United
States, that competition (between groups as well as between individuals)
is a social advantage since it "is worth more to society than it costs"
(20). Certainly, the hypothesis that competition reaching even the
dimensions of conflict contributes to individual and group progress
cannot be abandoned. This, in fact, is the position taken with respect,
at least, to the social role of conflict in industry by a number of
contributors to a recent book, Industrial Conflict, edited by Kornhauser
et al. (21).

This reference to industry brings to mind another illustration of the
presentation of value judgments unsupported by facts derived from
research. There has been considerable thought given to the role of the
union, in comparison with that of other social organizations, in providing
"substitute" satisfactions for wants and needs that are presumably
frustrated by the job conditions under which people work in modern
industry. Writing within the context of a scholarly work, Krech and
Crutchfield state with conviction that "the labor union, by and large,
can better meet most of the workers' needs and demands than can other
organizations. As we have seen . . . most social organizations will generally
reflect the major needs of its members, and labor unions will therefore
be more 'tailored' to the needs of the workers than will religious
organizations or other less homogeneously composed social organizations"
(22, italics mine).

At the time this statement appeared, there was little available
in the way of research findings bearing on the workers' perception
of other social organizations (apart from the industrial plant) in
comparison with their perception of the union. So far as religious
organizations are concerned, there were not, to my knowledge, any facts that
would support or disprove the conclusion reached by Krech and
Crutchfield.




Studies conducted since 1948 do not show that workers themselves
perceive the union as the prime medium for satisfying most of their
needs. Thus, in a study of a teamsters union, by Rose, 75 percent of
members referred to "getting higher wages," and 31 percent to getting
"job security," as a purpose of the union (23). No other single purpose
is mentioned by as many as 20 percent of the workers involved. Similar
findings, in other studies dealing with the worker's perception of the
union (24), likewise throw serious doubt on the view that the union
does or can satisfy the needs for participation, for self-expression, for
self-respect, for status, or a host of other psychological and social needs
better or more fully than do other types of social organization.

There is still little, if any, evidence bearing specifically on the question
whether labor unions can or will be more "tailored" to the needs of
workers than will religious organizations. It seems true, as Krech and
Crutchfield contend, that unions are, in fact, assuming accessory
functions of the type that enlarge the potential for the satisfaction of more
and more needs of its members. As is also pointed out by Krech and
Crutchfield, this is likewise true of religious organizations. They provide
no evidence that one is doing this to a greater extent or with better
results than the other. Furthermore, although current research on dual
loyalties (for example, to the union and the religious organization)
points to the fact that each organization may better satisfy some
specified need, findings do not in any sense settle the question whether
either is or can be better "tailored" to provide direct or "substitute"
satisfactions for most needs.

In using this illustration, I am not, for the moment, concerned with
the evaluation of the role of either the union or of religious organizations
in the life of the individual and in modern society. I am concerned with
treatment of the roles of these and of other social organizations by
psychologists in a manner that confuses theory or value judgments
with facts in a manner that may, with or without intent, mold the
attitudes of the reader or student with respect to social institutions
rather than enlighten him with respect to their roles as revealed by
research. The finding reported in a recent study by Keehn, that the
resemblance within a group of well-known psychologists (N = 27) was
confined to high homogeneity with respect to a continuum of
"humanitarianism and anti-religionism" (25) perhaps lends special pertinence
to the illustration under consideration.

Many illustrations of premature and also biased generalizations from
relatively little in the way of facts are to be found in industrial
applications of psychology that, as may be suspected, are of special interest to
me. Thus, earlier discussions of the effects of repetitive work, and also
current discussions of automation have suffered both from an absence
of historical perspective and from the "naturalistic fallacy" in which
subjectively determined goals and moral values are confused with the
empirical methodology and outcomes of scientific research (26).

A necessarily brief illustration from another area of research and 
application may help to reveal the wide scope of the problems under 
discussion in this article. In a volume entitled Motivation and
Personality, Maslow takes the position that "science is based on human values
and is itself a value system" (27, p. 6). Acting on this premise, he has
described a Utopia, called Eupsychia, characterized by the fact that all
men are psychologically healthy. Essentially, according to Maslow, this
means that "the inhabitants of Eupsychia would tend to be permissive,
wish-respecting, and gratifying (whenever possible), would frustrate
only under certain conditions . . . and would permit people to make
free choices wherever possible. Under such conditions," adds Maslow,
"the deepest layers of human nature could show themselves with great
ease" (27, p. 350).

Here Maslow appears to accept what Skinner has described as a
dominant view characterizing the theory and practice of psychotherapy
(expressed earlier in the primitivism of Rousseau), namely, that man is
essentially good and kind, and is corrupted only by social forces imposed
from without. Thus, Rogers, the high priest of psychotherapy, takes
issue with Freud's view (28) that man's basic nature "is
primarily made up of instincts which would, if permitted expression, result
in incest, murder, and other crimes" (12, p. 56). The contrary, Rogers
contends, is the fact. "One of the most revolutionary concepts to grow
out of our clinical experience," he writes, "is the growing recognition
that the innermost core of man's nature, the deepest layers of his
personality, the base of his 'animal nature,' is positive in character - is
basically socialized, forward-moving, rational and realistic" (12, p. 56).
The goal of psychotherapy therefore naturally becomes that of
providing a client-centered, permissive atmosphere that leads to adjustment
through the revelation, by the individual to himself, of the essentially
"self-preserving and social inner core" of his personality (29).

Which of these views, that of Freud or that of Rogers, can we
accept as scientific truth? In what measure are the tremendous
structures of psychoanalysis and psychotherapy built on a foundation of
empirically established facts? And to what extent can we accept
adjustment itself as a prescription for living, as a socially desirable goal?
Or is there justification for Lindner's view that the whole concept of
adjustment "is a mendacious lie, biologically false, philosophically
untenable, and psychologically harmful" which, according to Lindner,
"disregards many if not all the pertinent facts of human nature" and
represents "an untruth that is rendering man impotent at a time when
he needs the fullest mastery over his creative abilities" (30).

The Scientist and His Moral Values

Whether this is true or not (31), the sad fact is that the immense
superstructure of psychological practice often rests on a foundation of
scattered, splintered, and tinderlike data that could fall apart with the
most meager essays in the way of further exploration through the use
of available scientific techniques. Psychologists and psychiatrists alike
seem loath to acknowledge this. Only too often we seem possessed, not
by an appropriate and deep sense of humility, but, instead, with an
urge to substitute our value judgments, frequently uncontaminated
by facts, for those held by others and as perhaps expressed by colleagues
in related fields of economics, history, political science, philosophy,
religion, and so forth. Like Scaphio and Phantis in the delightful comedy
Utopia Ltd., by W. S. Gilbert, we seek to enter the world of affairs to
the voice of a chorus that sings (32)

    "O make way for the Wise Men!
    They are prizemen

    They're the pride of Utopia -
    Cornucopia
    Is each in his mental fertility.
    O they never make a blunder,
    And no wonder,
    For they're triumphs of infallibility."

It is possible that in this paper, and also in my earlier publications,
I may appear to have clothed myself in the mantle of the "wise man."
It is unquestionably evident that much if not all that I have said here
is in the nature of value judgments. In fact, I make no claim to the
scientific authenticity of my judgments. Furthermore, this article does
not purport to set up a scientific system of moral values, or even to
support the position that this can be done.

Nevertheless, moral values are involved, and these require serious
thought whenever psychologists turn their attention to newer
developments in the way both of the theory and applications of the science of
human behavior. This seems the occasion to recall the description, by
Pliny, of the activities of the clothiers of Rome who met in the Forum
in the autumn of each year and whose activities made caveat emptor
(let the buyer beware) the expression of bitter experience on the part
of the Romans (33). The very fact that the infant science of human
behavior can already make important and useful contributions to human
welfare does not entitle us to play the role of the architects preeminent
of the new Utopia.

We are not privileged to let our individual moral values, instead of
hard facts, set our standards of conduct as scientists. We cannot
conscientiously permit even a despair of finding ethical absolutes to lead
us, in the words of Keckskemeti, to "smuggle them in behind intellectual,
psychiatric, and political screens" (34). There is no time better than
now to recall the forceful appeal by A. V. Hill that "scientists should be
implored to remember that, however accurate their scientific facts, their
moral judgments may conceivably be wrong" (35). Let us take pride
and courage in the dedication of our work as scientists to the cause of
mankind, to defending and enhancing the worth of the human being
([illegible], p. 3ri). We must, nevertheless, simultaneously keep constantly in
mind the necessity for clearly separating our thinking and wishes with
respect to ordinary affairs from the "critical habits of thinking" (35)
that characterize the true scientist and establish the inherent integrity
of a science.



REFERENCES AND NOTES

t'-urhrtti Shiiuinni lUctumu.y iKuiik tiiid Wnffiitjlls. Nrw York. 192t 
v \KuX\rT Hrf'uhm {\n:2: o print,, I !>> \UW.,, N.-w Vnrk. lOIT). /-W^,,, 
l{frt.ihf'tf fltif'harits. I^umi, ». lOOI). 

K. i^ Hiimv. }:;,tt/ rutvkmird fH lll^hto|| MiHIiii. Hostoii. UNH). 
(:. •!" l^•r^..ri..^ // t Ur^ ^/ /.v,,/,|>^j, /^„,,,. ,y,, 

. !:..».): .r|.niitMj \^^ ' ( ,1,, l^ivr.' li.- FrMnrr. \rw V.>rk. 195:}). 
iVw-V'' !'*^* ^ • '^^ ■••'^ r. I.rin hy Mitrmilhiii. NrW York. 

iV,,/,M,v.:/ ./fWir Khnyy.un "u . h m Sv V.. M, riiM. rW |. |a.V>; ri'printiMi 

J- mii.;m.im. '>nr*,,:r. jfwr a-., Xrim-v 5 W /rx ..lr/.f (lT:,ii. r.'Hrint.Ml hy Oxfnr i 
purm. !t4 t.iffunu^i 'IT.Vi; ni^rti '.^ti fis iu:„ri\ I'liis. |Vs^ \rw > ork |y'''>)- 

^' }l (t^«»<>K 'l^iM :IH(M) /r^ O.Umt 1800) 

Iin,»lish.-u HI Orrirvi <MTupU>yt dt i 'h ,:»\:nhriii:iil ( .j,rN" i. !>.m.,. nU.ul IH.'iO). 
\- iluJ.'y. Hni,^ AVr.. UVjrW (DuU> Uu«y iJ.inik \. w 
10. B. F. Skinner, Walden Two (Macmillan, New York, 1948).
11. B. F. Skinner, Science and Human Behavior (Macmillan, New York, 1953).

12. See particularly C. R. Rogers, "Some directions and end points in therapy," in
O. H. Mowrer, Ed., Psychotherapy: Theory and Research (Ronald Press, New York,
1953).






13. It is apparent that I here (as also elsewhere in this paper) distinguish between
fact and value and, at least by inference, reject the view, appearing in current
discussions of theory of knowledge, that facts are in themselves value judgments.
Actually, I do not accept the view that existential and normative propositions
are equivalent, that scientific and ethical statements are basically similar [G. A.
Lundberg, "Semantics and the value problem," Social Forces 27, 114 (1948)].
By contrast, I am inclined to accept the view, as expressed by C. Kluckhohn,
that although existence and value are intimately related and interdependent,
they are "at least at the analytical level - conceptually distinct." However, a
detailed discussion of this controversy is not appropriate in this paper. The
reader interested in a detailed discussion of theoretical considerations in this
area is referred to publications cited here, particularly reference 26, and, in
addition, to a chapter "Values and value orientations in the theory of action:
an exploration in definition and classification," by C. Kluckhohn et al., in Toward
a General Theory of Action, T. Parsons and E. A. Shils, Eds. (Harvard Univ.
Press, Cambridge, Mass., 1951), pp. 399-433.

14. See particularly G. W. Allport, The Nature of Prejudice (Addison-Wesley,
Cambridge, Mass., 1954) and G. Saenger, Social Psychology of Prejudice (Harper,
New York, 1953).

15. K. R. Popper, The Open Society and Its Enemies (Princeton Univ. Press,
Princeton, N.J., 1950).

16. G. Murphy, "Human potentialities," J. Soc. Issues Suppl. Ser. No. 7 (1953).

17. T. M. Newcomb, Social Psychology (Dryden, New York, 1950), p. 27.
18. S. E. Asch, Social Psychology (Prentice-Hall, New York, 1952), p. 316.

19. D. Huff, How to Lie with Statistics (Norton, New York, 1954), p. 19.

20. B. Aaron, "Changing legal concepts in industrial conflict," in A. Kornhauser,
R. Dubin, A. M. Ross, Industrial Conflict (McGraw-Hill, New York, 1954).

21. A. Kornhauser, R. Dubin, A. M. Ross, Industrial Conflict (McGraw-Hill, New
York, 1954).

22. D. Krech and R. S. Crutchfield, Theory and Problems of Social Psychology
(McGraw-Hill, New York, 1948), p. 548.
23. A. M. Rose, Union Solidarity (Univ. of Minnesota Press, Minneapolis, 1952).
24. See M. S. Viteles, Motivation and Morale in Industry (Norton, New York, 1953),
Chap. [illegible].

25. J. D. Keehn, "The expressed social attitudes of leading psychologists," Am.
Psychol. 10, 208 (1955).

26. See particularly A. J. Ayer, Language, Truth and Logic (New York Univ. Press,
New York, [illegible]); G. E. Moore, Arguments against Ethical Naturalism
(Northwestern Univ. Press, Evanston, Ill., [illegible]); and P. Keckskemeti, Meaning,
Communication, and Value (Univ. of Chicago Press, Chicago, 1952).

27. A. H. Maslow, Motivation and Personality (Harper, New York, 1954).

28. See particularly S. Freud, Civilization and Its Discontents (Hogarth, New York,
[illegible]).

29. Studies of "feral" children and other investigations bearing on the "essential
nature of man" are summarized in a recent volume by C. Leuba, The Natural
Man (Doubleday Papers in Psychology, 1954).
30. R. Lindner, Prescription for Rebellion (Rinehart, New York, 1952), p. 12.
31. Lindner might find support for his views on the nonresistant adjusted man in a
recent comparison, by the biologist H. W. Stunkard, of the sources of degeneracy
in parasitic animals and inhabitants of the societies of ants and bees with
the loss of freedom in the [illegible]-controlled welfare state of mankind ["Freedom,
bondage, and the welfare state," Science 121, 811 (1955)].
32. W. S. Gilbert, Plays and Poems (Random House, New York, 1935), p. 588.

33. M. Beard, A History of the Business Man (Macmillan, New York, 1938), pp. 40-41.
34. P. Keckskemeti, "The psychological theory of prejudice," Commentary 18, No. 4,
359 (1954).

35. A. V. Hill, "The social responsibility of scientists," Bull. Atomic Scientists 7,
371 (1951).



PANEL DISCUSSION 

Clinical vs. 
Actuarial Prediction




Clinical And Actuarial Prediction 
in a Setting of Action Research 



NEVITT SANFORD



When I first looked over Paul Meehl's moving account (2) of his
conflicts about clinical versus statistical prediction, I thought of course
I would be able to say something helpful, something which if not
therapeutic would at least be comforting. I myself had not been troubled by
this particular conflict, supposing as I did that statistical prediction
was merely a tool for demonstrating, or testing the generality of, what
one knew already. Now, since I have looked into this matter somewhat
more carefully, I must admit that Paul Meehl has me worried. If there
are not places for both clinical and statistical prediction, if the two
cannot be reconciled, then I have to look forward to the imminent
splitting of my personality. Hopefully, then, but not without anxiety,
I am trying to figure out how in my scheme of things clinical and
statistical methods are related, or kept separate, integrated or confused.

I would suggest at the start that what divides the clinicians and
the statisticians, what they get passionate about at any rate, are not
so much differences about the best way to perform a given task, as
differences in more general outlook, perhaps even in temperament. The
arguments are very likely to concern what ought to be predicted, what
predictors to use, what level of predictability is possible or desirable, what
is going to be done with the predictions once they have been made.

What I propose to do now is consider some of our activities at Vassar,
with attention to such issues as these and in the light of some of the
arguments that have been advanced in favor of the two kinds of
prediction.

We are trying to predict withdrawal from college, by methods that
are strictly actuarial, even "blindly empirical." How can we justify
this in the eyes of our clinical friends?

In our circumstances this is a far less expensive proceeding than any
clinical one we might use to accomplish the same thing. A battery of
some 1100 items having been given to entering freshman classes, the
matter of finding predictors of withdrawal is a straightforward machine
operation. We do not want to take the time to study admissions data,
look up students who have dropped out, or to interview a sample of
entering freshmen and on that basis guess who will withdraw.






1955 INVITATIONAL CONFERENCE



Moreover, the evidence is that our mechanically constructed device
will score more hits than would the usual "clinical" procedure, e.g.,
guessing on the basis of an interview, or perhaps a few records and
tests. (It will have to go some, however, to do better than the
Admissions Committee, who predicted that no entering students would
withdraw and, as far as the freshman year was concerned, was correct in
about 90% of the cases. The Committee proceeds clinically, I suppose.)
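The remark about the Admissions Committee is a base-rate argument, and a small calculation may make it concrete. The sketch below is only an illustration: the class of 100 students, the 10 withdrawals, and the behavior of the "device" are invented numbers, not Vassar results, and the function name `accuracy` is hypothetical.

```python
def accuracy(predictions, outcomes):
    """Fraction of cases in which the prediction matches the outcome."""
    hits = sum(p == o for p, o in zip(predictions, outcomes))
    return hits / len(outcomes)

# Hypothetical freshman class: 100 students, 10 of whom withdraw (True).
outcomes = [True] * 10 + [False] * 90

# The Committee's implicit rule: predict that no one withdraws.
committee = [False] * 100

# Hypothetical device: flags 7 of the 10 withdrawals plus 2 false alarms.
device = [True] * 7 + [False] * 3 + [True] * 2 + [False] * 88

committee_accuracy = accuracy(committee, outcomes)
device_accuracy = accuracy(device, outcomes)
withdrawals_caught = sum(p and o for p, o in zip(device, outcomes))

print(committee_accuracy, device_accuracy, withdrawals_caught)
```

Under these assumed figures the blanket prediction is right 90% of the time without identifying a single withdrawal, so a device has to make very few errors before its hit rate exceeds the base rate.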

Consider the difficulties of making clinical predictions of a criterion
such as we are considering. Say that I know the subjects well, chiefly
on the basis of intensive interviews. I would be biased, probably in
the direction of leniency, and I would be confused. I would think of so
many hypotheses favoring one or the other action, withdrawing or
remaining in college, that I would feel quite lost when it came to the
matter of assigning weights. (Harold Webster tells me that a clinician
who can think of many factors which seem to have some association
with the criterion would probably do best if he just gave them all the
same crude weight. I'm sure that we often over-weight an interesting
psychodynamic factor, or else over-compensate for a tendency to do so
by supposing that intelligence is of paramount importance.)
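Webster's advice, to give every plausible factor the same crude weight, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the ratings are invented, each factor is assumed to be oriented so that a higher value means "more likely to withdraw," and the function names are hypothetical.

```python
from statistics import mean, pstdev

def standardize(values):
    """Convert raw scores to z-scores so every factor is on the same scale."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def unit_weight_scores(factors):
    """Sum the standardized factors for each student, weighting each factor 1."""
    z = [standardize(column) for column in factors]
    return [sum(student) for student in zip(*z)]

# Hypothetical ratings for three students on two factors, each oriented
# so that a higher value means "more likely to withdraw".
factors = [
    [1, 2, 3],      # factor A
    [10, 20, 30],   # factor B, on a different raw scale
]
scores = unit_weight_scores(factors)
print(scores)
```

Standardizing first keeps a factor from dominating merely because its raw scale is larger, which is one way of reading "the same crude weight."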

If I knew the subjects well, I would probably be thinking about the
relative strengths of variables in a given individual, and about how the
variables related one to another, rather than about group norms for
any of these variables. The chances are that I would know little about
the situation in which the criterion behavior occurs. (Is it not often
the case, in clinical prediction studies, that they involve either
clinicians who have only vague notions about the criterion or else
people such as deans or admissions officers who know the criterion but
are not very good clinicians?) Actually, in the case of an entering
freshman whom I had learned to know well, I might have very good notions
about whether or not I wanted to take him along on a cruise, when I
would be called upon to anticipate his behavior in a thousand
situations, and still be quite unsettled about such a dichotomous
criterion as dropping out versus not dropping out of college.

So it seems to me that in the present instance it ought to be granted
that statistical methods can do better.

It should also be recognized that such a predictive device as we are
working on can have little practical value when it is taken by itself.
Its applicability is both limited and dubious. We already know that what
goes for Vassar does not hold for certain other colleges, and that what
holds for freshmen does not hold for sophomores. And it is recognized
that if there should be changes in the way the college manages its
students, the whole thing would have to be done over again.





It must be recognized, too, that however good our statistical
predictions might be, the college will not adopt a statistical policy
with respect to admissions.

Much of the passionate rejection of empirically derived tests, by
clinicians and humanists, is based on a fear, sometimes justified, that
what has been derived actuarially will be applied collectively. It does
seem that strong adherents of actuarial methods tend to be
institution-centered, while clinicians tend to be individual-centered.

If our predictive device has little practical value, it would seem to
have even less scientific value. I am assuming that we merely pull
items, and cross validate in successive groups. We establish a close
relationship between, let us say, "tendency to drop out of Vassar in
the freshman year," as measured by tests, and dropping out of Vassar
in the freshman year. We define no psychological variables, state no
hypotheses, invoke no theory. This kind of thing is actually done. Since
actuarial prediction of socially defined criteria became the order of
the day, the study of personality for its own sake has been rather
neglected. I know of one research organization that was founded with the
object of studying personality but which became converted to actuarial
prediction of practically important criteria, and where discussion of
psychology is no longer heard. Only methodology is discussed.
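For readers unfamiliar with the procedure, "pulling items and cross validating" can be sketched roughly as follows. This is an assumed, simplified version of empirical keying, not the Vassar procedure itself: the answer sheets, the three-item "battery," the 0.4 threshold, and all function names are invented for illustration.

```python
def endorsement_rate(group, item):
    """Proportion of a group answering 'true' (1) to the given item."""
    return sum(person[item] for person in group) / len(group)

def pull_items(dropouts, remainers, min_gap):
    """Key every item whose endorsement rates differ by at least min_gap."""
    key = {}
    for item in range(len(dropouts[0])):
        gap = endorsement_rate(dropouts, item) - endorsement_rate(remainers, item)
        if abs(gap) >= min_gap:
            key[item] = 1 if gap > 0 else -1
    return key

def scale_score(person, key):
    """Score one answer sheet on the empirically keyed scale."""
    return sum(direction * person[item] for item, direction in key.items())

# Derivation group (hypothetical answers to a three-item battery).
dropouts = [[1, 0, 1], [1, 1, 1], [1, 0, 0]]
remainers = [[0, 1, 1], [0, 1, 0], [0, 0, 1], [1, 1, 1]]
key = pull_items(dropouts, remainers, min_gap=0.4)

# Cross-validation group: a later class scored with the same key.
new_dropouts = [[1, 0, 0], [1, 0, 1]]
new_remainers = [[0, 1, 1], [1, 1, 0]]
dropout_scores = [scale_score(p, key) for p in new_dropouts]
remainer_scores = [scale_score(p, key) for p in new_remainers]
print(key, dropout_scores, remainer_scores)
```

Nothing in the sketch names a psychological variable; the key is whatever happens to separate the two groups, and the only check is whether it separates them again in a fresh group.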

Why then do we bother with this apparently trivial exercise? Can we
yet manage to derive some scientific and practical value from it?

For one thing, we will make it serve an exploratory function. A study
of the scales and items which separate drop-outs from remainers will
give us some notions of what is going on. It will yield suggestions
about the college as well as about processes in the students. Once
again, it is a much less expensive procedure than making a sort of
anthropological investigation of the college and a clinical
investigation of leavers.

These investigations might, however, turn up personal and situational
factors that did not overlap entirely with those suggested by our
"blindly empirical" approach.

At any rate, with such factors in mind we will be in a position to
formulate and to test some hypotheses. If students with tested
characteristics a, b, c, and d are dropping out of College A, because of
conditions x and y in that college, then we will predict that students
of this sort will not drop out of College B, where these conditions do
not obtain, nor out of College A, should these conditions be changed.

As a matter of fact, to speak of practical matters, discussion with the
college of results, suitably interpreted, of the statistical prediction
kind might conceivably set in motion a process of self-criticism which
would lead to changes in conditions x and y.






The early identification of probable drop-outs might be the basis for
starting a counseling program that would reduce withdrawals among
those students, if any, whose interests were better served by remaining
in the college. If students dropped out despite this counseling, one
should have a pretty good understanding of why they did.

A study of "false positives" would be particularly interesting, for the
light it might shed on education at the college. One always hopes, of
course (and with little doubt that the hope will be realized), that the
predictions will not be too good.

What would a true clinical approach, in our situation, entail? At the
least, it would seem, 6 or 8 hours of interviewing and testing, with a
sample of entering freshmen, in order to arrive at a "dynamic
formulation" of each case. Students would undoubtedly be changed by this
clinical work, quite possibly in a way that would make them less likely
to drop out of college. Prediction would have to take this circumstance
into account. This whole business could, quite conceivably, be put into
an actuarial table or equation; but this would be useful, of course,
only in situations where this same program was in effect.

By the time one had completed this clinical work, he would very
probably have lost interest in whether his subjects dropped out of this
college or not. Other, broader aspects of their future lives would, by
and large, be seen as much more important. I assume that no clinician
would undertake such a project just to see whether he could predict as
well as the statisticians.

At the conclusion of this work the clinician might understand the
student well enough so that he could explain some of his processes to
him if this seemed to be in the student's interest. (And this, by the
way, is a pretty good test of one's understanding of another person. It
is the kind of knowledge which when verified and generalized is of the
very essence of the psychology of personality.)

And the clinician might feel an ethical obligation to pass some of his
knowledge along to the student. No one has a better right to it. This
may well play hob with the prediction concerning withdrawal. But, on
the other hand, the clinician would have the advantage of what might
well be the best predictor of all, that is, what the student says, to a
trusted counselor, he is going to do.

Such a clinical approach might be of very considerable practical value,
assuming that the concern is with education and welfare and not merely
with drop-outs. The college could learn something about itself; and the
student's decision to withdraw or not might be of a more considered
kind.






Consider in this connection the Tavistock Institute's work with
Company X. Three clinicians and three officials of the company observed,
in a group discussion situation, applicants for positions; then they
divided the interviewing so that each applicant was seen by two
clinicians and two officials, including the head of the department in
which the given applicant would work, and it was later decided in
conference who was to be selected. A poor way, it would seem, to
determine what kind of man made good at Company X, but an excellent way,
as it turned out, to reduce turnover of highly trained personnel and to
improve morale in the company. The officials were learning quite a lot
about themselves and quite a lot of psychology.

If the clinician were to put in 6 or 8 hours per subject, in the above
project, it would be surprising if he did not end up in a counseling
relationship with some of them, making referrals in other cases.

As a counselor or psychotherapist he would assume, with Kluckhohn
and Murray (1), that any client or patient was "in certain respects
a) like all other men, b) like some other men, c) like no other man."
He would lean most heavily upon general laws of human functioning,
however crudely these were formulated, and next most heavily upon
his conceptions of syndromes, patterns or types that were more or less
common. He would try to order to these general laws and conceptions
the generalizations he would have to make about the unique productions
of his client.

He would expect to receive relatively little help from tests. Of course,
there are no objective tests for more than a small fraction of the
variables with which he would have to deal. He would find other people's
formulations on the basis of projective tests interesting, but not a
very practical investment of time and energy; he would have to make his
own formulations in any case, and he would probably consider that he
was in a better position to do this than was the projective tester. He
would regard empirically derived predictors of success in psychotherapy
as a useful check upon his own judgment. He could not take them too
seriously, considering as he would that success in psychotherapy is
still undefined and that the predictors probably did not apply to his
psychotherapy anyway. He would have no great difficulty, assuming good
training, in recognizing that he had a tough case on his hands, but
numerous other considerations might outweigh the dubious prognosis
in his decision to try and help the person.

In sum, I seem to have come out in favor of interaction between
clinical and statistical methods, on the grounds that each can be
supportive of the other. They need be competitive but rarely, since for
a given task one or the other can usually be judged to be better suited.






In scientific work, the major role of statistical prediction is still to
demonstrate what has been observed in clinical or experimental
situations. It has been suggested here, however, that the development of
empirical predictors of socially defined criteria may also have an
important exploratory function; it may suggest hypotheses of general
scientific interest. The fact that so much effort is directed to the
statistical prediction of criteria which are socially important but
scientifically dubious is hardly a criticism of the method itself. It is
up to those who are primarily interested in personality functioning to
define and to estimate the variables which for them are of fundamental
importance, so that the great potency of statistical prediction may be
directed to these.

When it comes to practical work the thing to emphasize is the huge
gap between what is known or can be known from actuarial prediction
and what needs to be known and considered in order to determine wise
policy. In clinical work with individuals the matter is quite clear. The
clinician should be grateful for whatever objective test results can,
without too much expense, be placed in his hands, but he can do no
more than consider these in their place among a host of other things
which he must judge and act upon. Matters are not very different from
this in the case of applications within an institutional setting. One
might hope that here too the psychologist will take part in the
analyzing and the judging of the whole complex of affairs within which
his predictive devices have a relevant place.

REFERENCES

1. Kluckhohn, C., and Murray, H. A. Personality in Nature, Society, and
Culture. New York: Knopf, 1948.

2. Meehl, P. E. Clinical vs. Statistical Prediction. Minneapolis: Univ.
of Minn. Press, 1954.




Clinical Versus Actuarial Prediction



CHARLES C. McARTHUR 



Our question is, "Which predicts better, clinical or actuarial methods?"
The correct answer is, "We don't know; no one has done the experiment."
The moral is, "Somebody ought to!"

I know there have been experiments purporting to answer this question.
They just seem for the most part so poorly designed that they are
irrelevant.

How should a relevant experiment be designed? Well, the rule is that
both clinician and actuary should be given every opportunity to show
their wares. That's the only possible way to compare them. If, with
apparent scientific sophistication, you hold the conditions under which
the actuary and the clinician must predict "constant," one or both men
will be handicapped by the conditions you prescribe. How can you say
"how much" of a handicap each man suffers, or how to make his opponent's
handicap "equal"?

Years ago, we had a similar problem in intelligence testing. In the
early days of the testing movement, it was thought that the way to
be "fair" to all the children tested was to repeat precisely the same
external ritual, with the examiner working in the same office, with the
same lighting, introducing himself and the test in the same way and
giving instructions verbatim. The I.Q.'s so derived were presumed to
be directly comparable. Alas, they were not. What was soon learned
about intelligence testing was that true comparability could be had
only when the examiner varied his behavior appropriately from child
to child, so as to obtain for each child the maximally favorable
conditions. When each child was "given all the breaks," the resulting
I.Q.'s could be justly compared.

So it is with our question. If we really want to know how the clinician
and the actuary compare, we have to let each man
(a) use the data of his choice,
(b) make the analysis of his choice, and
(c) make the predictions of his choice.
Now, I'm not an actuary and I'm speaking to actuaries, so I'd better
let them make their own choice of data, analyses and predictions. It is
about the clinical half of this contest that I feel entitled to speak, if
only because, at the Study of Adult Development, we have recently
gathered some experimental observations on the way people gather
good and bad clinical predictions. I would like therefore to review the
proper choices of data, analysis and predictions for the clinical half of
a good clinical versus actuarial experiment.

* * *

If we want to make good clinical predictions, what kinds of data† will
be the data of our choice?

We want plentiful data. Plentiful enough to make us feel that we may
have an adequate sample of all kinds of our subject's behavior. Those
of us who earn a living as working clinicians go day after day, year
after year, jumping to premature conclusions on inadequate evidence.
That's what we're paid to do. All the same, when we do experiments
for the advancement of knowledge, we are forced to accept the stern
reminder of Robert White that "An attempt to cut the testing schedule
below ten to fifteen hours with each subject is merely a proposal to
sabotage the research." A more usual battery for experimental purposes
would run to thirty or forty hours.

We want various data. It is almost indispensable to watch one subject
interacting with at least half a dozen examiners. It is indispensable to
sample his behavior at all psychic "levels." Projective devices are a
must but so are observations of S. performing workaday acts in his
everyday setting.

We want overlapping data. We'd best see our man tackling comparable
problems under very different conditions, with different examiners,
different degrees of stress, in different contexts.

We want open-ended data. The ratio of the subject's talk to the
examiner's talk should be at least ten to one.

We want fully recorded data. That is another lesson from intelligence
testing. More and more, the by-products of a test situation turn out
to be more useful than the measurement that was the historical purpose
of the test. A Wechsler-Bellevue recorded verbatim, the irrelevant
remarks being recorded most scrupulously of all, tells us many times
as much as the I.Q. or even the sub-test profile of scores. White has
made this point well, at the same time explaining why we insist on
obtaining overlapping data. "Our problem-solving test," he points out,
"will perhaps also be a test of frustration tolerance, a test of control
over anxiety, a test of level of aspiration, or a situation that happens
to mobilize an infant traumata, and our report on its results must
include as much of this information as can be observed." The rhetorical
italics are mine.

† Quite typical of the problems of communicating ideas across the two
frames of reference, clinical and actuarial, is the fact that this talk
was prepared with no awareness that the word "data" contained any
ambiguity. At luncheon before the panel discussion, the writer became
aware that when he says "data" he means "contents of verbal or
behavioral acts" and that when actuaries say "data" they mean "scores."
Neither usage is perfectly exclusive but the difference is gross enough
to tangle communication badly.



If we have now collected the right kind of data, we may be in a
position to take our second step and ask what should be the clinical
analysis of our choice. And there is one best analysis. I, too, have
heard the rumor that each clinician uses the method that is his personal
favorite. The rumor may even be true; tastes vary, though science be
constant. Nonetheless, both logic and empirical validation identify one
best technique of clinical analysis. This technique is neither
intuitive, as rumor so often has it, nor a Mystery, nor is it
unavailable to actuaries. You see, the clinical analysis of choice is
nothing but the application of the Scientific Method!

I am not the only clinician who has this idea. "The diagnosis of each
personality is," according to White, "a miniature scientific experiment."
Meehl would also, although with some caution, accept the idea that
the good clinician makes "'little special theories' the applicability of
which is to one person."

Nor am I without evidence. At the Study of Adult Development, we
recently asked a series of clinicians to formulate rich case data that
was ten years old, and then to make postdictions of the subject's
behavior during the ten years since the last recorded entry. We knew
what the subject had been doing these last ten years; and, while our
clinical prophet tried to guess what had happened, we all smugly sat
around in a circle, "holding the book on him."

What all our clinical prophets did under these very trying validation
conditions seemed to be to build from the data a clinical construct, a
conceptual device, a "special theory applicable to one person," a model
of that person, that made this statement on page 17 of the record
consistent with that remarkable quotation back on page 14. Each datum
became grist from which was ground a formulation of the premises
governing all of S.'s behavior, the lifelong premises, the treasured
self-consistencies with which the person being studied had learned to
face the world. Each batch of data lent itself to hypotheses about the
person, hypotheses that could be checked out against new data as the
record progressed and could be revised with each successive
cross-validation provided by turning another page. After conning all the
data, the clinician possessed a fuzzy, but gradually sharpening,
conceptualization of the man under study. "He seems to be the sort of a
person who . . ." Then the clinician could make his predictions by doing
imaginary experiments with the model. There would be paths down which
the conceptualized person could effortlessly stroll, while there were
alleys into which he simply could not be made to turn. And that was how
good predictions got generated.

Tomkins has rigidly formulated this technique. Perhaps his most
important statement is that we have to derive from the data itself the
very categories in which that data will be cast. "In general," he warns
us, "we do not know exactly what to look for. If we prejudge the
categories of analysis, we may commit serious errors. What check, then,"
he asks, "have we on the adequacy of our selection of categories of
analysis? It is our conviction that the logic of the individual's
fantasy itself must be our ultimate criterion." End of quote. I would
agree unreservedly. That facility at induction which enables him to
derive for each new person studied a fresh set of categories that
maximizes the patterning of this particular set of data is the very
hallmark of the good clinician.

Tomkins goes on, in a chapter that should be required reading for all
graduate students, to specify how one can deduce from the data what
categories were implicit in the mind of the subject himself. Nor are
Tomkins' instructions vague or dependent upon intuition; he uses as
his tool John Stuart Mill's canons of logic! Mill sets down rules like,
"If two or more instances of the phenomenon under investigation have
only one circumstance in common, the circumstance in which alone all
the instances agree is the cause, or effect, of the given phenomenon."
Or else, "If an instance in which the phenomenon under investigation
occurs, and an instance in which it does not occur, have every
circumstance in common, save one, that one occurring only in the former,
the circumstance in which alone the two instances differ is the effect,
or cause, or an indispensable part of the cause of the phenomenon." And
so forth, down through the Joint Method of Agreement and Difference,
the Method of Concomitant Variation, Necessary Causes, Sufficient
Causes, etc.

Furiously pedantic as they may sound, Mill's canons work. Suppose,
for instance, your man tells one T.A.T. story in which the boy's mother
wants him to practice the violin but the boy rebels, afterward feeling
very guilty. Suppose in another story the mother wants the son to go
to school but the boy quits school, again feeling guilty. Suppose a
third story tells of another rebellious son who is now, however, being
reconciled to his mother and doing as she wishes, and consequently this
son becomes a great success and feels very happy. Mill and Tomkins would
have us infer that, for this narrator, only a hero who does as his
mother wishes may be permitted a happy ending. The category "obedient
sons" will therefore play a dynamic part in our formulation of this
man's personality and hence in what we can predict about his reactions
to future events. For some other man whom we might study, this
category could have absolutely no meaning.
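The inference from the three stories can be written out mechanically. The sketch below is a loose simplification of Mill's Method of Agreement and Method of Difference, not Tomkins' own procedure: each story is reduced to a hand-coded set of "circumstances," and the labels and function names are invented for the illustration.

```python
def method_of_agreement(instances):
    """Circumstances shared by every instance in which the phenomenon occurs."""
    return set.intersection(*instances)

def method_of_difference(positive, negative):
    """Circumstances present where the phenomenon occurs and absent where it does not."""
    return positive - negative

# Hypothetical coding of the two stories that end in guilt.
guilty_stories = [
    {"mother makes a demand", "violin practice", "son disobeys"},
    {"mother makes a demand", "going to school", "son disobeys"},
]
# Hypothetical coding of the story that ends happily.
happy_story = {"mother makes a demand", "son obeys"}

shared = method_of_agreement(guilty_stories)
candidate = method_of_difference(shared, happy_story)
print(shared, candidate)
```

With this coding, the only circumstance that accompanies guilt and is missing from the happy ending is the son's disobedience, which is the category the narrator himself appears to be using.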

I realize that clinicians don't usually think that systematically. The
best empirical evidence about how clinicians actually think in practice
is provided by Shneidman. After reviewing fifteen shockingly different
systems for interpreting the same T.A.T. protocol, each system being
offered by an "authority," Shneidman is able to discern a common set
of steps in the clinical analyses. For most workers, the initial step is
Charcot's: to look and look and look. They read and reread the data.
The next stage seems to be "semi-organized notes" on repetitive or
logically consistent patterns in the data. Then the criterion of internal
consistency is applied and re-applied to trial hypotheses about the
structure of the person's motives. Only in the end, when a diagnostic
label is sought or if some one datum sticks out as incongruent with the
rest, is any general psychological theory invoked. That was also true
of our Study of Adult Development clinicians: theory came last. The
one discussant we had who was embarrassingly inaccurate tried to
deduce the behavior of the man he was discussing directly from the
postulates of a general psychological theory. The successful prophets
were those who remained inductive. None seemed to be as systematic
as Tomkins would have them; but that only proves that the methods
of analysis we use in everyday practice are less than ideal. It is
probably true that we all could profit from more seminars entitled "The
Diagnosis of Personality as an Hypothetico-Deductive Process."

* * * 

Coming to the third portion of our clinical versus actuarial
experiment, Tomkins' logic calls our attention to what sort of
predictions clinicians should choose to make. If the categories in which
we cast our data were those that logically arose from the data itself,
then we have already decided what aspects of a particular case we can
categorize, and hence we have unintentionally decided which aspects of
the case we can predict. The clinical analysis has both this virtue and
this liability: that it predicts what will be predictable . . . and what
won't. An allied technique, usually called "thematic" analysis, has this
same property. There are certain themes in which S. is very emotionally
involved and it is these matters that we have most data on and can
best predict. We have no basis for trying to predict any and all aspects
of S.'s behavior.

Tomkins' formulation also gives us a second rule about our
predictions: they must be contingent predictions. What we know is what
Tomkins calls "the conditions for ..." a certain behavior's appearing.
"If S. perceives his boss as a nurturant elder, he will react by being
ungrateful." If not, something else will happen. "If S. sees a woman as
a sexual object, he will assume her to be evil, but if he perceives her
as a supportive mother figure, he will assume her to be good." Always
our predictions have the form "If . . . then . . ."

It follows that the usual experimental demand that the clinician
predict multiple choice criteria, which look nicely objective but never
state contingencies, almost certainly dooms the clinician to failure. It
just isn't possible to say, in general, that "S. will be very aggressive."
It is possible to say, "If S. sees the situation in this or that way,
he'll be very aggressive." It is absurd to say, "S. will get well." It
makes sense to say, "If his therapist can play this or that role toward
S., S. will respond beautifully." Indeed, I wonder if it isn't more
important to make these contingent predictions, not only because they
turn out to be right more often, but also because they have more
practical value.

Clinicians themselves don't seem to be aware of what predictions
they can and can't make. Time after time, an excellent clinical analysis
gets reported in terms of a rating scale, and so dooms itself to being
invalidated on follow-up. When are we going to learn that we can't
say "Mr. A. will be more aggressive than Mr. B." without specifying
the conditions? Not only do our available methods of analysis prevent
this, but it is quite likely that people just aren't made that way. Some
of the most famous and recent and spectacular failures of really good
clinical studies to stand up under cross-validation have arisen because
of this one error.

I would insist, then, that any valid estimate of the accuracy of
clinical prediction must permit the clinician to make contingent
predictions and to limit himself to predictions about topics of his
choice. I would hasten to add, however, that giving the clinician this
liberty will not result in trivial, superficial, or safe and sure
generalizations. The predictions the clinician can make relate to those
very behaviors that have most importance of all, because they are the
behaviors that matter to the subject himself.

* * *

So now we have reviewed three sets of conditions: the proper data,
the proper analysis, and the proper predictions that must be had if we
are to learn whether the clinician be a prophet or a charlatan. Perhaps
you see why I feel that none of the studies done till now are very
relevant.

There are two sets of such studies. 

Meehl has judiciously reviewed a set of studies that hold data and
predictions constant while comparing two forms of analysis, one
actuarial, one what the experimenters call "clinical." Perhaps the best
known of these experiments is Sarbin's, though Meehl has located a
dozen and a half more. The nature of the analysis is always
insufficiently specified, but the piecemeal data supplied as a basis for
prophecy always seems to preclude the use of a truly clinical analysis.
Sarbin, who did better than some of the others, provided his prophets
only with high school rank in class, aptitude test scores, a preliminary
interviewer's notes, and a paper and pencil personality inventory.
Apparently no one, save the "preliminary interviewer," who left only
"notes," had looked at the person in action. From such straws the
clinician was asked to make bricks! That the clinicians in this study
did as well as the actuaries is irrelevant; what they had to be doing,
with such non-clinical data, was what Sarbin accuses them of doing: they
were managing somehow to function as a human substitute for an I.B.M.
machine. Almost all the other studies supply non-clinical data; all
demand multiple choice, non-contingent predictions.

A second group of studies includes recent large-scale follow-ups on
assessment batteries, such as Murray's O.S.S. program, the Kelly and
Fiske studies at Michigan on predicting the success of clinical psy-
chologists, and the California studies of personality that are beginning
to be published. None of these has suggested any great validity for
the clinical method. We have to take these failures of the clinical
method more seriously: they were designed by good clinicians and used
excellent clinical data. One presumes that proper clinical analysis got
applied, though this is not always clear from the published accounts.
What vitiates all these studies, however, is their failure, in two senses,
to make clinical predictions. First, there seems to be little or no con-
tingent prediction. Worse, nearly all the predictions take the form of
rating scales. That decision in designing these studies determined the
nature of the findings.

* * *

So we still don't know the answer to the main question before us.

Only a study under proper conditions will be conclusive. If clinical
predictions under ideal conditions fail to come true, running the ac-
tuarial half of the experiment will hardly be required! I happen to
believe, however, that clinical predictions, as operationally defined in 
this paper, will turn out to be 100% true; 100%, that is, less only the 
sampling error that is inevitable because we see 10 hours and not 40 
years of our subject's behavior, and less the error arising from unre- 
liability of those who observe both the independent variables and the 
criterion variable. 

That's my null hypothesis. I, like all my fellow clinicians, am eager
to see the hypothesis tested.

1. Meehl, P. E. Clinical vs. Statistical Prediction. Minneapolis: Univ. of Minnesota
Press, 1954.

2. Shneidman, E. S. Thematic Test Analysis. N. Y.: Grune & Stratton, 1951.

3. Tomkins, S. S. The Thematic Apperception Test. N. Y.: Grune & Stratton, 1947.

4. White, R. W. "What is tested by psychological tests?" In Hoch, P. H. and
Zubin, J. Relation of Psychological Tests to Psychiatry. N. Y.: Grune & Stratton,
1951. pp. 3-14.




Clinical vs. Actuarial Prediction: 
A Pseudo-problem 



JOSEPH ZUBIN 



There are three possible ways of dealing with the problem presented
by the title of this paper: (1) adopt the clinical point of view, (2) adopt
the actuarial point of view, or (3) declare the dilemma to be non-existent.
The latter course is the one I have chosen and as a result I expect to
get the brickbats from both sides. Clinicians may accuse me of "leaving
the field" because of my inability to cope with the dilemma, while
actuarians may regard my approach as merely probing the null hypothe-
sis. I feel, however, that the dilemma is in reality a pseudo-dilemma
created by the hopefully temporary gap that now separates the clinician
from the research worker.

The reason for my position becomes quite clear in retrospect. I began 
my career in psychology with a statistical net to bag the elusive dif- 
ferences that may exist between abnormals and normals. Disappoint- 
ment in this undertaking turned me to the study of the individual case. 
As a result I began to realize that both sides of the coin— the actuarial 
and the clinical— belong to each other in an inextricable manner. It 
was not, however, until I began to study the philosophy of science that 
I could logically resolve the opposition between the two approaches.¹

Scientific method is characterized by a continued interaction between
observation and schematization (1). Which came first is difficult to
determine. Primitive man's observation of nature soon led him to notice
certain regularities which he schematized into expectation or hypothesis
as we now call it. These hunches, hypotheses or discoveries, if you will,
constitute the first step, the context of discovery according to Reichen-
bach (9). This step might be likened to the storming of a beachhead in
the continuing war between science and ignorance. The second step is
to verify the hypothesis. This leads us to the context of justification
which might be likened to the establishing of law and order in the
territory which the beachhead opened up. No amount of beach-storming,

¹ I owe much of this insight to Dr. Eugene I. Burdock and to Dr. Raymond J.
McCall, former students who guided my reluctant steps.






however, can conquer a territory, and no amount of empty drill can lead
to victory. It is the sequential interaction between the two contexts
that leads to success. The clinician, on the one hand, often becomes lost
in the land of discovery, narcissistically enjoying every new idea,
smelling every new hunch and titillated by every new possibility but
only too rarely, if ever, leaving the context of discovery for the context
of verification. The actuarian, on the other hand, often becomes lost
among his equations, gadgets and techniques, sharpening and polishing
under the assumption that the sharper the tool, the better the eventual
results. But, for much of our work our tools are already too fine. Most
of the concepts which we deal with clinically are too open, too crude to
warrant even the .01 level of confidence on the score of either type of
inference error (Type I or Type II). But psychology is not alone in
this fix. Even biology, a science supposedly higher in the hierarchy of
exactness, suffers from loosely defined concepts which nevertheless do
not prevent scientific progress. Julian Huxley (4), in defining the con-
cept of species, says:

"However, we must remember that species and other taxo-
nomic categories may be of very different type and significance
in different groups; and also that there is no single criterion of
species. Morphological difference; failure to interbreed; in-
fertility of offspring; ecological, geographical, or genetical dis-
tinctness -- all those must be taken into account, but none of
them singly is decisive. Failure to interbreed or to produce
fertile offspring is the nearest approach to a positive criterion.
It is, however, meaningless in apogamous forms, and as a
negative criterion it is not applicable, many obviously dis-
tinct species, especially of plants, yielding fertile offspring,
often with free Mendelian recombination on crossing. A com-
bination of criteria is needed, together with some sort of flair.
With the aid of these, it is remarkable how the variety of
organic life falls apart into biologically discontinuous groups.
In the great majority of cases species can be readily delimited,
and appear as natural entities, not merely convenient fictions
of the human intellect. Whenever intensive analysis has been
applied, it, on the whole, confirms the judgments of classical
taxonomy."

It is thus not the precision of the concept, but its power in explaining
behavior which differentiates the good from the poor concepts (5). The
clinician who enchants himself with the brilliance of his discoveries and
hunches as well as the actuary who spends his time putting a keener edge
on his tools and proudly contemplates their sheen are fanatics who have
"redoubled their energies when they lost their goal." For the goal, after
all, is the verifiable understanding and prediction of human behavior
and to achieve this goal, the observations of the clinician and his
hunches as well as the verification of these hunches by the actuary are
essential.

From this point of view, the question of whether the actuarial ap- 
proach is superior to the clinical is tantamount to asking whether the 
sperm is more important than the ovum. Both are equally important 
and no progress can be made with one alone. In fact, exercising one 
alone in isolation from the other is a rather unproductive form of activity 
despite the satisfaction it may afford. 

The better the hunches, the more effective will be the actuarial pre-
diction, once the hunches are verified. To compare clinical impression-
istic prognosis with the actuarial prognosis derived from a previously
formulated clinical hunch is a travesty! How could a new untried
clinical impression ever equal the statistically verified residue of earlier
clinical impressions? We should have been so certain of our actuarial
techniques that nothing but a complete victory in every precinct should
have satisfied. Why did the results of the 24 studies (8) fail to show an
advantage in each instance? The answer lies in the relative rigor or
looseness of the criteria. When rigorous, specific and specified criteria
are available, one can always build tests which will prognosticate
successfully. As the criteria become looser and less explicit it is de-
batable whether either method, actuarial or clinical, can accomplish
much. Prognoses of mental illness, for example, should be based on a
specified follow-up period since outcome varies with period of follow-up.
If the actuarial formula is based on immediate outcome as a criterion
while the clinical prognosis is based on eventual outcome, it is no wonder
that the actuarial method is superior when the results are evaluated
against immediate outcome.†



† While Professor Meehl did not read his discussion for lack of time, the few re-
marks he made led me to make the following comments in order to clarify our differ-
ing points of view: I had anticipated brickbats from the right and from the left,
but not from the center. Despite Meehl's very thoughtful book (8), the distinction
between actuarial and clinical prediction is heuristic rather than basic. The process
of prediction for a group is quite different from prediction for an individual. The
former can be completely actuarial as in life expectancy tables; the latter by its
very nature must be clinical if it is to result in action. A distinction needs to be
made between a prediction and a decision based on that prediction. The prediction
might be that there is a .70 probability of success. What one does on the basis of
such a probability in the case of a single individual is best exemplified by what one
does for himself when faced with such a prediction. In the last analysis, decision
is a "clinical" act, not an "actuarial" one. To have one standard in mind when one
makes decisions about his own fate, and another standard when one makes decisions
for a patient is the "double standard" at its worst. No one would select a secretary
or a wife on test scores alone, even if the multiple r were as high as .80 (which it

rarely is in any prediction studies I have seen). Why should one be willing to decide
on a patient's therapy on actuarial grounds alone? Mind you, I am not arguing
against utilizing regression equations for prediction; but I am concerned with what
you do as a consequence of the prediction. When actuarial predictions succeed in
encompassing 90% of the variance in the behavior under observation, we can safely
leave prediction to a statistical clerk and save the clinician's time for the more
arduous task of therapy. Since most actuarial predictions account for less than half
of the variance in the observed behavior, actions based on such predictions need
the integrating act of the clinical decision.
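
The arithmetic behind these figures may be worth spelling out. The sketch below is an illustration added here, not part of the original remarks; it assumes the usual convention that the proportion of variance accounted for is the square of the (multiple) correlation, and the function names are hypothetical.

```python
import math

def variance_explained(r: float) -> float:
    """Proportion of criterion variance accounted for by a correlation r."""
    return r * r

def r_needed(proportion: float) -> float:
    """Correlation required to account for a given proportion of variance."""
    return math.sqrt(proportion)

# A multiple r of .80 accounts for about 64% of the variance.
print(round(variance_explained(0.80), 2))  # 0.64
# Accounting for 90% of the variance would need r of about .95.
print(round(r_needed(0.90), 3))            # 0.949
# "Less than half of the variance" corresponds to r below about .71.
print(round(r_needed(0.50), 3))            # 0.707
```

On this reading, even the rarely attained multiple r of .80 leaves roughly a third of the variance unaccounted for.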

When the clinician makes a prediction, looks up tables of dosages of drugs, con-
templates syndromes of symptoms, he is engaged in statistical or actuarial activity.
What he does with this information, his volitional decision, is a "clinical" act.

When the statistician chooses an experimental design, selects a technique or
decides on the relative weights to be assigned to certain factors, he is acting clinically.
His subsequent analysis and the predictions derived from probability considerations
are, of course, actuarial.

The complete process by which a decision is reached with or without the help
of Wald's decision functions is a volitional act which has been described intro-
spectively by Ach (Ach, N. "Analyse des Willens". Handbuch der biologischen
Arbeitsmethoden, Abt. VI, Teil E., Berlin, Urban and Schwarzenberg, 1935*).

According to Ach, man is never closer to his inner self than when he makes a
volitional decision. Freedom of the will, apparent or real, underlies this decision-
making process, and is the very essence of mental life. To maintain that in our
present state of ignorance, we can substitute a regression equation for the volitional
act would be flying in the face of reality. Decision belongs to the context of dis-
covery, a land whose rules and regulations are as yet unknown.

* The author translated this book into English some ten years ago and several
carbon copies are available on loan.

Nevertheless, it is important to call the attention of clinicians to the
fact that they have spent too much time in "hunch-land" and not
enough in the land of verification. By the same token it is important
to indicate to the statistician that the assumptions of normality, linearity,
continuity, homoscedasticity, etc., etc. which underlie many of his
techniques including the multiple regression equation, discriminant
functions as well as factor analysis, are not suitable for the non-linear,
discontinuous, unit-less type of observation which the clinician deals
with. Between the land of discovery and the land of verification a
bridge must be built, consisting of the proper techniques to meet the
clinical needs. Clinical psychology today is in about the same position
that agriculture was before Fisher or physics was before Newton. Just
as Fisher had to develop techniques for dealing with the hunches eman-
ating from the practical agronomist, so a new Fisher is required to
develop techniques for testing the hunches emanating from the clinic.
This new Fisher will have to convert our present group-centered tech-
niques into individual-centered tools, will have to deal with syn-
dromes and patterns and profiles emanating not from data which satisfy
the requirements of factor analysis, but from the crude amorphous
qualitative data which defy factor analytic methods, or which are
verily disembowelled by such high-powered techniques.




A good case in point is a recent study on the effects of drugs on psy-
chological test function (7). In order to determine the effect of a new
antihistamine on psychological test performance, the effect of the new
drug was contrasted with the effect of a placebo, a stimulant and a
hypnotic drug. The psychological techniques consisted of a group of
conceptual, perceptual and psychomotor tasks and an interview. The
results of one of the tests, the critical flicker fusion test, will be sufficient
to clarify the point at issue (3). The means of the group of 24 patients
who participated in this experiment are shown in Table 1.

TABLE 1

The Critical Flicker Fusion Threshold in cycles per second for the
Various Chemical Agents (N = 24).

AGENT                 DAY     MEAN
Placebo                1      30.8
Stimulant              2      31.1
Antihist (low)         3      30.7
Soporific (low)        4      31.1
Antihist (high)        5      30.7
Soporific (high)       6      31.1
Hypnotic               7      31.7
Placebo                8      31.3






CHART 1

The Critical Flicker Fusion Threshold in Cycles per
Second for the Various Chemical Agents (N = 24)

[Bar chart. Agents in order: Placebo, Stimulant, Antihist (Low), Soporific (Low),
Antihist (High), Soporific (High), Hypnotic, Placebo. Horizontal axis: cycles per
second, 30.0 to 32.0.]






The data were subjected to an analysis of variance the results of
which are shown in Table 2.

TABLE 2

Summary of the results of the total analysis of variance for three
threshold determinations of CFF at three levels of apparent brightness
at each of the two light-dark ratios for twenty-two subjects over [...]

                            (1)        (2)      (3)       (4)        (5)     (6)
                          SUM OF       PER                MEAN
SOURCE                    SQUARES      CENT     D.F.     SQUARE       F       P

1 [Agents]                 278.97      0.50       7       39.85      1.83
2 [Brightness levels]    38572.67     68.67       2    19286.33    118.42    .01
3 Instruments             2190.67      3.90       1     2190.67     13.45
4 Individuals             7202.85     12.82      21      342.99     15.78    .01*
1-2                        115.64      0.21      14        8.26      2.12
1-3                        110.24      0.20       7       15.75      2.08    .05
1-4                       3193.69      5.69     147       21.73      2.87    .01*
2-3                        325.73      0.58       2      162.87     30.44    .01
2-4                        370.03      0.66      42        8.81      1.65    .05*
3-4                        379.38      0.68      21       18.07      2.39    .05*
1-2-3                       54.43      0.10      14        3.89      1.69
1-2-4                      979.70      1.74     294        3.33      1.45    .01*
2-3-4                      224.75      0.40      42        5.35      2.33    .01*
1-3-4                     1111.29      1.98     147        7.56      3.29    .01*
1-2-3-4                    675.71      1.20     294        2.30
Within                     387.47      0.69    2112        0.18
Total                    56173.21    100.02    3167









It will be noted that the "between-agent" variance was not sig-
nificant when compared with the largest interaction term but the
"between-individual variance" and its interactions were statistically
significant as shown by the starred F ratios.
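
How such F ratios are formed can be sketched briefly. The figures below are taken from Table 2 as best they can be read from a damaged print, and it is an assumption of this sketch that "the largest interaction term" used as the error term is the agents-by-individuals interaction (1-4); the helper names are hypothetical.

```python
def mean_square(sum_of_squares: float, df: int) -> float:
    """Mean square = sum of squares divided by its degrees of freedom."""
    return sum_of_squares / df

def f_ratio(ms_effect: float, ms_error: float) -> float:
    """F ratio of an effect against the error term chosen for it."""
    return ms_effect / ms_error

# Sums of squares and degrees of freedom as read from Table 2 (approximate).
ms_agents = round(mean_square(278.97, 7), 2)                   # between agents
ms_individuals = round(mean_square(7202.85, 21), 2)            # between individuals
ms_agents_x_individuals = round(mean_square(3193.69, 147), 2)  # interaction 1-4

print(ms_agents, ms_individuals, ms_agents_x_individuals)      # 39.85 342.99 21.73
print(round(f_ratio(ms_agents, ms_agents_x_individuals), 2))       # 1.83
print(round(f_ratio(ms_individuals, ms_agents_x_individuals), 2))  # 15.78
```

Under that assumption the computed ratios agree with the tabled entries for agents and for individuals.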

Because of the significance of the interindividual variance and its
interactions, each individual subject was treated separately as an in-
dependent universe. Since 10 measures of critical flicker fusion threshold
were taken each day on each individual, an analysis of variance for the
single individual could be performed. The results indicated that the
group treatment of the data had hidden more than it revealed. The
individual treatment of the data indicated that half of the group (11
cases) had remained unaffected by the chemical agents. In those who
showed significant effects, the low soporific dosage showed a significantly
improved performance in 6 subjects and a significantly poorer per-
formance in two subjects. The higher dosage of the soporific agent
improved the performance of 8 subjects and reduced the performance
of 3, leaving the other 11 subjects unaffected. The rest of the data
are shown in Table 3.




TABLE 3

Number of subjects showing significant improvement or worsening
for each chemical agent on the critical Flicker Fusion Test (Strobo-
scope).

CHEMICAL AGENT           IMPROVED   WORSENED   UNAFFECTED     N
Stimulant                    4          5          13         22
Antihistamine (low)          5          4          13         22
Soporific (low)              6          2          14         22
Antihistamine (high)         3          8          11         22
Soporific (high)             8          3          11         22
Hypnotic                     6          4          12         22
Total                       32         26          74        132
Average                      5.4        4.3        12.3       22






CHART 3

Number of Subjects Showing Significant Improvement or
Worsening for each Chemical Agent on the Critical
Flicker Fusion Test (Stroboscope)

[Bar chart. Legend: Improved, Worsened, Unaffected. Vertical axis: number of
subjects.]






It is clear that the group of subjects was quite heterogeneous with 
respect to the effect of the various chemical agents. For this reason 
group statistics should always be examined in conjunction with indi- 
vidual statistics wherever possible. 
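
The way a group treatment can hide individual effects may be illustrated with a small sketch. The thresholds below are hypothetical and are not the data of the study; the one-way analysis of variance is used only as a stand-in for the fuller design of Table 2, and the function name is an assumption of this illustration.

```python
from statistics import mean

def one_way_f(groups):
    """F ratio for a one-way analysis of variance over lists of scores."""
    all_scores = [x for g in groups for x in g]
    grand = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical thresholds in cycles per second: one subject improves under
# the drug, the other worsens by the same amount.
subject_up = {"placebo": [30.0, 30.2, 29.8, 30.1], "drug": [32.0, 32.1, 31.9, 32.2]}
subject_down = {"placebo": [32.0, 32.1, 31.9, 32.2], "drug": [30.0, 30.2, 29.8, 30.1]}

# Pooling the two subjects makes the agent means identical.
pooled = [subject_up[a] + subject_down[a] for a in ("placebo", "drug")]

print(one_way_f(list(subject_up.values())))    # large F for this subject alone
print(one_way_f(list(subject_down.values())))  # large F for this subject alone
print(one_way_f(pooled))                       # essentially zero for the group
```

Each subject taken as his own universe shows a marked agent effect, while the pooled analysis shows none, because the two effects run in opposite directions.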

Just how a heterogeneous group can be subdivided into more homo-
geneous subgroups becomes an important question for the clinical-
actuarial controversy. If we could find a technique for subdividing a
group into homogeneous subgroups, we could then apply group statistics
to the subgroups and avoid the impasse which occurred in the previous
example.

An example of the application of individual-centered techniques
which keeps the sights of the experimenter focused on the individual
instead of on the group is the technique of like-mindedness (10). Some
20 years ago we faced the problem of developing a personality inventory
which would be of help in classifying mental patients. This study was
reported in part in 1937 but because of an error in computation lay
uncompleted until recently when the error was discovered and the
analysis completed. While we have since given up the use of inventory
items as the sole basis for classification, and have (we believe) found
more pertinent indicators, the method is general enough to be applicable
to most of the data in the clinical field.

The Personality Inventory Form (6) which consisted of a distillate
of 70 items from a matrix of 1000 found in other inventories and in case
histories, was administered to some 1000 patients of varying types of
illness and to 1000 normal controls. In the process of selecting the 70
items, only those items were retained which differentiated the patients
from the normals in all the age groupings, the two sex groups, and
illness categories, since we wished to get a screening test which would
separate the ill from the well. In retrospect this seems to have been a
mistake. In picking out only the items which differentiated, we selected
the liabilities of the patient group, and eliminated their assets. Perhaps
the patterning of the assets and liabilities is a more useful basis for
screening than the total number of liabilities alone.

A sample of 68 male schizophrenic patients and 68 normal controls
matched for age, sex and education was then obtained and by the use
of IBM scoring machines it was possible to obtain the agreement scores
of each patient with each of his 67 colleagues and each of the 68 normal
controls. Similarly the agreement scores for the normals were also
obtained. A sample of the agreement scores is shown in Table 4.
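
A minimal sketch of how an agreement score might be computed follows. It assumes, as the text does not spell out, that the agreement score of two persons is simply the number of inventory items they answer alike; the function names and the six-item answer lists are hypothetical.

```python
def agreement_score(a, b):
    """Count the items on which two respondents gave the same answer."""
    if len(a) != len(b):
        raise ValueError("both respondents must answer the same items")
    return sum(1 for x, y in zip(a, b) if x == y)

def mean_agreement(person, others):
    """Mean agreement score of one respondent with a list of other respondents."""
    return sum(agreement_score(person, other) for other in others) / len(others)

# Hypothetical yes/no answers to a 6-item inventory (the actual form had 70 items).
p1 = [1, 0, 1, 1, 0, 1]
p2 = [1, 0, 0, 1, 0, 1]
p3 = [0, 1, 0, 0, 1, 0]

print(agreement_score(p1, p2))        # 5
print(agreement_score(p1, p3))        # 0
print(mean_agreement(p1, [p2, p3]))   # 2.5
```

Applied to every pair in a group, the first function would yield a symmetric table of the kind sampled in Table 4, and the second a mean intragroup or extragroup score for each person.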






TABLE 4

Agreement Scores between 5 individuals of the abnormal group on
a test of 70 items.

                        INDIVIDUALS
INDIVIDUALS      A      B      C      D      E
A                --     37     32     48     44
B                37     --     47     48     50
C                32     47     --     46     46
D                48     48     46     --     33
E                44     50     46     53     --





The mean agreement scores are shown in Table 5 and Table 5A and
Chart 3.

TABLE 5

Intragroup and Extragroup Agreement Scores for 68 Schizophrenic
patients and 68 matched normal controls.

                  INTRAGROUP                   EXTRAGROUP
SCORES       NORMAL   SCHIZOPHRENIC       NORMAL   SCHIZOPHRENIC
48-50         100.0                                    100.0
45-47          70.6                        100.0        82.4
42-44          45.6       100.0             97.1        73.6
39-41          23.3        80.9             55.9        53.0
36-38          11.7        60.3             32.4        41.2
33-35           7.3        32.4             14.7        26.5
30-32           2.9        22.1              8.8        20.6
27-29           0.0        10.3              0.0         8.8
24-26                       5.9                          2.9
21-23                       1.5                          2.9
18-20                       1.5                          0.0
15-17                       0.0







TABLE 5A

Intra-group agreement scores of 68 schizophrenics and 68 matched
normal controls and extra-group agreement scores of 34 schizophrenics
and 34 matched controls.

                               AGREEMENT SCORES
                       INTRA GROUP              EXTRA GROUP
GROUP                 N      M      σ          N      M      σ
Normal Controls      68    44.6    4.69       34    40.2    3.94
Schizophrenics       68    37.0    5.[illegible]  34    40.2    7.31
Difference                  7.6                             3.37
[Critical ratio]            8.5                             3.34
P                          <.01                            <.01






CHART 3

Cumulative per cent distribution of intra-group agreement
scores of 68 schizophrenics and 68 matched normal controls
and of extra-group agreement scores of 34 schizophrenics
and 34 matched normal controls.

[Line chart. Horizontal axis: agreement scores in class intervals from 15-17 to
48-50.]



The 67 pairs of agreement scores for each pair of individuals were
then correlated and the table of intercorrelations of these agreement
scores was subjected to a factor analysis. Table 6 shows the inter-
correlations.
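
The correlating step may be sketched as follows. It is an assumption of this illustration that, for any two individuals, the quantities correlated are their agreement scores with each of the remaining members of the group; the ordinary product-moment coefficient is used, the function names are hypothetical, and the figures are rows A and B of Table 4, so that only three other persons (C, D and E) enter instead of the 67 of the full analysis.

```python
import math

# Agreement scores for individuals A and B, as read from Table 4.
agreement = {
    "A": {"B": 37, "C": 32, "D": 48, "E": 44},
    "B": {"A": 37, "C": 47, "D": 48, "E": 50},
}

def pearson(xs, ys):
    """Product-moment correlation between two equally long lists of scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def profile_correlation(scores, i, j):
    """Correlate the agreement scores of i and j with every other person."""
    others = sorted(k for k in scores[i] if k != j and k in scores[j])
    xs = [scores[i][k] for k in others]
    ys = [scores[j][k] for k in others]
    return pearson(xs, ys)

print(round(profile_correlation(agreement, "A", "B"), 3))  # 0.577
```

Two persons who agree with the same colleagues and disagree with the same colleagues obtain a high coefficient; a table of such coefficients for all pairs is the kind of matrix that is then factored.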



TABLE 6

Correlation between agreement scores for normals and schizophrenics. (The figures above the long diagonal are
for the normals, the figures below are for the schizophrenics.)





I 


1 




4 


5 


6 


7 


8 


Q 


10 


11 


12 


13 


14 


15 


16 


17 




A, 
8. 
C. 

T\ 
U. 

K 
I 
C. 
H. 

1. 
J. 
K. 
L 
M. 
N. 
I), 
P, 
(). 


,504 
,414 

,ivl 

m 

MA 

W 

M 
.544 
,120 
,524 
.175 
.115 
.IN 
.610 
.325 
.349 


-.011 

.956 

.964 
.846 
,516 

-.034 
,654 

-,404 
.832 
,474 

-,120 

-,197 
,775 
,014 
,539 


-.211 
.728 

839 
^839 
.841 

.311 

,030 
,652 

-,366 
,846 
.442 

-.170 

-.171 
.658 

-.065 
.63? 


-.387 
.631 
,478 

,920 
,956 

117 
Mi 

,172 
,791 
-,250 
,863 
,624 

■ 017 
,825 
,122 
,643 


.180 
,607 
.387 
.322 

,912 

119 

.179 
,787 
-,161 
.876 
.596 
.003 
.052 
.786 
.165 
,625 


.052 
.518 
,277 
,190 
,571 

, JtfU 

.234 
.777 
-.174 
,900 
,626 
-,110 
-,067 
,832 
,217 
,641 


,267 
,372 
,034 
,055 
,570 
,563 

,032 
,241 
-.325 
.384 

„;i2 

-.1)91 
-.271 

,258 
-.118 

,328 


,086 
,741 
,673 
,389 
,570 
,342 
.345 

.283 
,417 
.200 
,504 
,164 
,276 
,170 
,413 
,046 


-,008 
,822 
,731 
,467 
,"59 
,609 
,347 
,732 

,095 
,750 
,514 
,012 
,345 
,743 
,198 
,592 


-,053 
,748 
,683 
,571 
,421 
,556 
,284 
,610 
,664 

-,209 

-,100 
,343 
,568 

-,086 
,277 

-.199 


,067 
,846 
,682 
,482 
,653 
,405 
,399 
,801 
,872 
,677 

,526 
-,065 
-,102 
,773 
,153 
,645 


-,141 

,554 
,605 
,409 
,383 
,297 
,096 
,436 
,625 
-,359 
,553 

-,201 

,576 
,627 
,330 
,181 


-,512 
,280 
,506 
,553 
-,066 
-,057 
-.241 
.123 
.271 
.301 
,195 
,448 

,349 
-,127 
,144 
,221 


-,141 

,840 
,836 
,537 
.491 
.389 
,198 
,701 
.119 
,773 
,785 
,621 
,433 

,019 

,116 

•,J"7 
— 


-.077 
.895 
.743 
,588 
,609 
,476 
,407 
,938 
,820 
,820 
,872 
.577 
.341 
.905 

,:i4 
.4i'i 
..... 


-.220 
.840 
.783 
.580 

fit 

.544 
.614 
.286 
.665 
.845 
.832 
,771 
,645 
,;99 
.f52 
K 

.129 


-,012 

,782 
,654 
,518 
,4I» 
,427 
,376 
,686 
,723 
,763 
,759 
,497 
,401 
,670 
» 

.. 


i: 

10 

11 

12 
13 
14 

15 
16 
17 




K 


B 




D 


E 


F 


G 


11 


1 


J 




L 


M 













TABLE 7

Loadings on rotated factors underlying agreement scores



Type 



III 



Subj. 



RotateiFaooiiI-oadinos' 



NoKVAU 



,35 

.11 

-,12 
.43 
-.27 
-,13 
,46 

■1 
-,52 

.32 



III 



-.36 
.10 
,19 

■M 
,94 
,20 
,15 



1.00 



Tjpe 



'+iir 



PATIEira 



Subj. 


r 


ir 


111' 


IV' 


— 


D 


.98 


-,01 


,03 


,03 


,91 


F 


.91 


,01 


,05 


,15 


,91 


E 


96 


,06 


.05 


-,05 


.92 


K 


.91 


-,0I 


.10 


-,01 


,84 


B 


,88 


-,21 


,05 


-,21 


.90 


c 


fifi 


-,20 


,01 


-.30 


,89 


0 


1 


,03 


.11 


.11 


.14 


I 

1 




,41 


,09 


-.02 


.81 


k 


M 


.11 


-.20 


.24 


,59 


Q 


1 


,08 


-.31 


.19 


,63 


M 


-,«1 


.83 


,11 


.05 


,11 


] 


-.16 


,M 


-,03 


,29 


.59 


P 


,11 


.16 


.10 


,60 


.42 


L 


.54 


.02 


.69 


.22 


,81 


G 


14 


.32 


.Ti 


-.21 


.35 


H 


,21 


-.48 


,45 


,10 


.49 


M 


-,01 


,41 


-,38 


,09 


,32 






CHART 4A

Factor Loadings of Type I and Type II
Normals on Factors I and II

[Scatter plot. Both axes run from -1.00 to 1.00 in steps of .25.]






CHART 4B

Factor Loadings of Type I' and Type II'
Patients on Factors I and II

[Scatter plot. Both axes run from -1.00 to 1.00 in steps of .25.]



Since these factors are merely for the purpose of classifying the
patients into homogeneous subgroups, their nature and identity are
of no consequence and as soon as we have established the subtypes in
our two samples of normals and abnormals, the factors can be dis-
carded. Thirteen of the patients showed a significant loading on only
one factor, ten of them on Factor I, 2 on Factor II and one on Factor
IV. One patient showed significant loadings on Factors I and III, and
three individuals were mavericks, showing no significant loadings on

any of the factors. In the normal group, fifteen of the seventeen indi-
viduals showed a significant loading on one factor, 11 on Factor I, 3 on
Factor II (one with a positive loading and the other with a negative),
one on Factor III, and the remaining two had significant loadings on
Factors I and III. It is not profitable to pursue this analysis further
except to indicate that this technique permits us to subdivide a large
group into like-minded or like-structured subgroups, regardless of the
number of variables involved and regardless of the types of distributions
that characterize them. It is a type of distribution-free factor analysis.
I prefer to regard it as a method for typological analysis. The next step
is to find out what the various subtypes have in common and this can
be done by studying the common properties of each of the subgroups
either with reference to their response pattern or other characteristics
such as vital statistics, socio-economic background, genetic factors,
etc., etc.

SUMMARY: 

I have tried to point out 3 major issues: 

1. That the contrast between actuarial and clinical prediction is an
unwarranted one. Instead, the two types of prediction supplement each
other and the discrepancies between the two should be studied for im-
proving each other reciprocally. Meehl has pointed out that behind the
clinician looms the shadow of the actuary and that the latter like the
undertaker will have the last word. I doubt this. For behind this actuary
is another clinician looking over his shoulder to see just where the formula
fails and behind him is a new actuary to see whether the corrections
introduced by the clinician hold, etc., etc. I would like to make a plea
for the clinician to leave "hunch-space" long enough to see how his
hunches hold up and for the actuarian to leave hyperspace long enough
to see whether his canonical formulas are applicable and what modifica-
tions they need for meeting the demands of the clinic.

2. Secondly, there is a need for more attention to the statistical 
problem of the evaluation of the individual case. The next break-through 
in our field is clinical statistics — the gearing of our powerful methods 
to the consideration of the individual case. 

3. Thirdly, there are signs on the horizon that some type of
break-through has already taken place. The emergence of interest in
pattern analyses or typological analysis is beginning to make a dent in
the interaction between clinician and psychometrician. By providing
like-minded or like-structured subgroups, it becomes possible to apply
present-day statistics to homogeneous groups in our clinical population.
This is the first step in the rediscovery of the individual. Our second
most important problem today is to find the pertinent variables for
classifying the groups into homogeneous subgroups. Here a reorientation
in psychology is called for. But what are the pertinent variables for
the description of man? Factor analytic methods have attempted to
answer this question. Factor analysis, however, has been applied largely
to the conceptual responses of man. The psychomotor, sensory and
physiological levels of response have been hardly tapped in factorial
studies. But the perceptual and conceptual functions are largely
dependent upon man's past experience and to a lesser extent on the
immediate "here and now" effects of brain function.

As long as we limit ourselves to the perceptual and conceptual levels,
we could regard man as an empty organism. When we begin to examine
the behavior of patients we often find that the conceptual area is
relatively intact. The functions which have been ingrained in the
individual are generally unaltered by shock therapy, psychosurgery and
by the disease process itself. The physiological, sensory and
psychomotor levels, and the stimulus-bound perceptual level, reflecting
as they do immediate brain functioning, are more pertinent for detecting
the deviations of the mind. When we develop better techniques for
tapping these functions, and apply suitable individual-centered
statistical techniques, we may resolve much of the conflict that now
exists between the clinic and the laboratory. Just to titillate your
appetite for such a classification, the last chart shows a suggested
outline (2).



TABLE 8

Examples of measurable activity related to behavior categories and
stimulus classes. (S = stimulus; R = response. The stimulus orders are
the columns of the chart and are listed in turn; the levels of observed
behavior are its rows.)

STIMULUS ORDER 0 (IDLING STATE)

  Conceptual      R: Reverie and phantasy
  Psychomotor     R: Spontaneous [remainder illegible]
  Perceptual      R: Spatial temporal orientation
  Sensory         R: Background cortical gray
  Physiological   R: EEG; basal PGR

STIMULUS ORDER I (DISTURBANCES OF HOMEOSTASIS)

  Conceptual      S: [illegible]; insulin shock; lowering of oxygen
                     tension
                  R: Amnesia; disorientation; psychological test
                     performance
  Psychomotor     S: [illegible]
                  R: Seizure
  Perceptual      S: [illegible]
                  R: Effect on visual orientation
  Sensory         S: [illegible]
                  R: Anesthesia
  Physiological   S: Hyperventilation
                  R: [illegible]

STIMULUS ORDER II (INAPPROPRIATE STIMULI)

  Conceptual      S: Electrical stimulation of temporal cortex
                  R: Memories; dreams
  Psychomotor     S: Electrical stimulation of motor cortex
                  R: Movement of limb, etc.
  Perceptual      S: LSD
                  R: Synaesthesia
  Sensory         S: Pressure stimulation above retina; electrical
                     stimulation of thermal receptors
                  R: Phosphene; warmth or cold sensation
  Physiological   S: Stimulation by implanted electrodes
                  R: Change in blood steroid pattern

STIMULUS ORDER III (APPROPRIATE STIMULI)

  Conceptual      S: Smelling a "sniff set"
                  R: Recognition of familiar odor
  Psychomotor     S: Painful stimulus
                  R: Arm withdrawal
  Perceptual      S: Rotating Benham disk
                  R: Subjective color experience
  Sensory         S: Light of graded intensity
                  R: Threshold response
  Physiological   S: Photic driving
                  R: Effect on EEG



TABLE 8 (Continued)

Examples of measurable activity related to behavior categories and
stimulus classes. (S = stimulus; R = response.)

STIMULUS ORDER IV (CONFIGURAL STIMULI)

  Conceptual      S: Aircraft forms
                  R: Identity of forms
  Psychomotor     S: Star-shaped maze
                  R: Mirror tracing
  Perceptual      S: Visual forms
                  R: Discrimination
  Sensory         S: Patterned light stimuli
                  R: Visual threshold
  Physiological   S: Patterned visual stimulation
                  R: Effect on EEG

STIMULUS ORDER V (SIGNS)

  Conceptual      S: Classical delayed response stimuli in animal
                     experimentation
                  R: Successful response by animal subject
  Psychomotor     S: Wagging tail, nuzzling (dog)
                  R: Petting by human observer
  Perceptual      S: Usual visual alternatives in animal discrimination
                     experiment
                  R: Selective response of animal subject
  Sensory         S: Infant's faint cry
                  R: Mother's auditory threshold
  Physiological   S: Bell-ringing in Pavlovian conditioning
                  R: Salivation

STIMULUS ORDER VI (SYMBOLS)

  Conceptual      S: Word association test
                  R: Association to stimulus words
  Psychomotor     S: Psychiatric interview
                  R: Electromyographic response
  Perceptual      S: Musical tones
                  R: Pitch discrimination
  Sensory         S: Words or sentences
                  R: Visual threshold
  Physiological   S: Verbal instructions to prevaricate
                  R: Effect on PGR



You will note that the left hand column lists the five varieties of
responses while the upper row lists the seven types of stimuli which
can elicit these responses. Thus, in the idling state, in which no
experimental variable is introduced, man is capable of emitting
physiological, sensory, perceptual, psychomotor and conceptual
responses. Such responses can also be elicited by disturbing man's
idling state in some controlled fashion, or by applying an inappropriate
or unusual stimulus, an appropriate stimulus, a configural stimulus, a
sign stimulus, or a symbol stimulus. Most of our tests have been limited
to this upper row, in fact to this last rubric, in which a symbol
stimulus elicits a conceptual response. Until we sample this whole table
(this behavioral Mendelejeff table, if you will), our understanding of
personality, be it of the ill or of the well, will be mighty limited.

REFERENCES 

1. Bronowski, J. The common sense of science. Cambridge, Mass.: Harvard
University Press, 1953.

2. Burdock, E. I. and Zubin, J. A rationale for the classification of
experimental techniques in abnormal psychology. J. Gen. Psychol. (in
press).

3. Davis, [initial illegible] J. A comparison of the stability of the
measures of critical flicker-fusion under two different light-dark
ratios as provided by the episcotister and the Strobotac. Master's
Essay, Columbia University, 1952.

4. Huxley, J. S. Introductory: Towards the new systematics. In Huxley,
J. S., Editor, The New Systematics. Oxford, 1940.

5. Kaplan, A. Definition and specification of meaning. J. Philos., 1946,
43, 281-288.

6. Landis, C. and Zubin, J. The Personal Inventory Form. New York
Psychiatric Institute, [year illegible]. (For experimental use only.)

7. Landis, C. and Zubin, J. The effect of thonzylamine hydrochloride and
phenobarbital sodium on certain psychological functions. J. Psychol.,
1951, 31.

8. Meehl, P. E. Clinical vs. Statistical Prediction: A Theoretical
Analysis and a Review of the Evidence. Minneapolis: Univ. of Minnesota
Press, 1954.

9. Reichenbach, H. Experience and Prediction. Chicago, 1938.

10. Zubin, J. Socio-biological types and methods for their isolation.
Psychiat., 1938, 1.

11. Zubin, J. A technique for measuring likemindedness. J. Abn. Soc.
Psychol.



Clinical Versus Actuarial Prediction 



LLOYD G. HUMPHREYS 



In the preparation of this paper on clinical versus actuarial
prediction* it occurred to me that a third type of prediction might be
recognized. I refer to the prediction of responses as a mathematical
function of stimulus situation and organism. Whether this constitutes a
third case or is to be subsumed under actuarial is of course a matter of
definition. Many psychologists would prefer to make this separation.
Actuarial prediction would then be restricted, if we use Spence's (4)
terminology, to response-response relationships. This is at any rate the
class of actuarial prediction with which my paper is concerned. One
clinical authority has recently termed this the "engineering" approach.
I inferred that he thought of it as a term of opprobrium. I do not find
it so and am happy to have this approach referred to as such if you
find it meaningful.

In the discussion of clinical prediction I shall restrict myself largely
to the situation in which a clinician or counsellor after little
acquaintance with the client, with or without test scores, intuitively
predicts some future behavior or status for the client. This may not be
fair to the clinician but it goes on continually in every clinic and
guidance institution. This rules out a second situation, in which
clinical predictions are made in therapy while the clinician is
gradually forming his hypotheses about the patient. This latter activity
is legitimately a human activity and is not to be assigned to a machine.
It is also clearly professional in character and is not to be assigned
to a clerk. The professional task, however, is to cure the patient; the
position of this activity in the development of science is as a source
of hypotheses to be tested. It is not a dependable source of knowledge
about human behavior.

Before proceeding with the main part of my discussion, it might be
pointed out that Meehl (3) underemphasized one important function
of the therapist in this second situation. In addition to hypothesis
formation on the part of the therapist, evaluation of traits not
presently measurable or not well measured (an ability shared with most
other people who know the patient well) also takes place. The
therapist's ability to predict the behavior of the patient during
therapy is in part due to trait evaluation and only in part to the
formation of hypotheses about trait combinations and dependencies within
the patient. This being true, if I were a clinician (if I may speak
hypothetically for the moment) I would want to see my patient in many
situations, not merely those involving a couch, in order to obtain
maximum breadth of behavior sampling.

*I have failed to give individual credit in the discussion to follow to
any of my colleagues in the Personnel Research Laboratory (Air Force
Personnel and Training Research Center, Lackland Air Force Base) because
so many have contributed to both data and ideas and because in a group
research organization it is difficult to assign specific credit. My debt
should nevertheless be recognized.

With respect to the first situation outlined, I have never had any
theoretical a priori expectation that clinical prediction could
successfully compete with actuarial predictions. Given valid tests and a
valid procedure the clerk and machine should be superior. Fortunately,
the evidence surveyed by Meehl supports rather strongly this conviction.
This issue is not a serious one, as far as I am concerned, on either
theoretical or empirical grounds. This includes Miss Anderson's
Employment Service Counseling. In clinical practice, however, it may
still constitute an important problem.

It is easy to understand why it is a problem, why it is that attempts
are made to second-guess test results by anyone engaged in individual
prediction. For this I have no pat dynamic explanation based on the
personality structure of clinicians, other than the belief that they are
motivated to do a good job. I am referring instead to the size of
standard errors of estimate. There is strong motivation here alone to
find ways of improving on the information furnished by the best of
tests.

What are our hopes of improving on present actuarial predictions by
statistical means and thus decreasing the clinician's motivation to do
the impossible? There are some obvious things to be done, there are a
few things that are perhaps not so obvious, and there are, I am sure, a
number of things which are yet to be discovered.

In the first place, we can pay more attention to the reliability of our
criteria. There is certainly no point in looking for additional
variables or better methods of combination if variability about the
predicted criterion score is largely measurement error in the criterion.
As a matter of fact we can with clear conscience correct our
correlations for unreliability of the criterion in evaluating the
quality of the people we place in jobs in a selection or guidance
program.
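
The correction referred to here can be illustrated numerically. What follows is only a minimal sketch, assuming the standard correction for attenuation applied to the criterion alone (the observed validity divided by the square root of the criterion reliability). The function name and the figures .40 and .64 are hypothetical and are not taken from the paper.

```python
import math


def correct_for_criterion_unreliability(r_xy: float, r_yy: float) -> float:
    """Estimate the validity against a perfectly reliable criterion.

    r_xy: observed correlation between predictor and criterion.
    r_yy: reliability of the criterion (hypothetical input, 0 < r_yy <= 1).
    """
    if not 0.0 < r_yy <= 1.0:
        raise ValueError("criterion reliability must lie in (0, 1]")
    return r_xy / math.sqrt(r_yy)


# Hypothetical figures: observed validity .40, criterion reliability .64.
observed_validity = 0.40
criterion_reliability = 0.64
corrected_validity = correct_for_criterion_unreliability(
    observed_validity, criterion_reliability
)
print(round(corrected_validity, 3))
```

Only the criterion is corrected in this sketch; the predictor is left at its observed reliability, since it is the fallible test that must actually be used in selection.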

Reliability tells only part of the story. Specificity in the criterion,
which is ordinarily considered a part of the reliable variance, is
important also. For example, correlations between independent raters
concerning a subject's officer quality are substantially higher for a
given situation at a given moment in time than if situation and time
vary. The degree to which officer quality in general is predictable,
however, is a function of the size of correlations between raters when
time and situation vary. This is not to say that how a man will be rated
by a particular supervisor on a particular job isn't potentially
predictable. It does mean that for purposes of evaluating the general
trait and for techniques that give us information about the ratee only,
this situational specificity should be allowed for. It should also be
clear that in order to increase predictions in specific situations we
shall need information both about rater and ratee, and knowledge of how
this information is to be combined.

We have also been careless about our criterion measures with respect 
to their homogeneity and comparability for all persons in the sample. 
Factorial complexity imposes no problem if it is uniform in the sample. 
But look for a moment at predictions of freshmen grade point ratios 
in which we typically lump students from all colleges of the university 
taking dozens of different patterns of courses from further dozens of 
instructors into a single criterion measure. In addition to the functional 
complexity of the criterion which varies from one part of the sample to 
another, we run into problems of lack of comparability of the units of 
measurement from subsample to subsample. Note that these difficulties
with the criterion do not affect its reliability if a student is consistent 
with respect to his choice of curricula and instructors. I knew one univer- 
sity, for example, in which engineering grades were below the campus 
average but in which the average engineering student was one standard 
deviation above the average of the rest of the campus in quantitative 
ability, two-thirds of a standard deviation above in verbal ability. 
Over-all correlations with grades were markedly attenuated. A good way 
to reduce this kind of error is to correlate predictors with separate 
course grades, obtain intercorrelations of the course grades, and then
predict any pattern of courses desired. One typically finds, for example,
higher correlations with single course grades than with grade point 
ratios. 
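
The procedure of predicting "any pattern of courses" from single-course correlations can be sketched as follows. This is an illustration under stated assumptions, not the Laboratory's actual method: course grades are taken as standardized, the composite is a weighted sum of them, and the usual correlation-of-sums formula is applied. All numbers and the function name are hypothetical.

```python
import math


def composite_validity(r_xc, r_cc, weights):
    """Correlation of a predictor with a weighted sum of standardized grades.

    r_xc:    predictor-by-course correlations (one per course).
    r_cc:    matrix of intercorrelations among the course grades.
    weights: weight of each course in the pattern (0 leaves a course out).
    """
    n = len(weights)
    numerator = sum(weights[i] * r_xc[i] for i in range(n))
    variance = sum(
        weights[i] * weights[j] * r_cc[i][j] for i in range(n) for j in range(n)
    )
    return numerator / math.sqrt(variance)


# Hypothetical correlations of one test with three separate course grades.
r_xc = [0.5, 0.2, 0.1]
r_cc = [
    [1.0, 0.3, 0.2],
    [0.3, 1.0, 0.4],
    [0.2, 0.4, 1.0],
]

first_course_only = composite_validity(r_xc, r_cc, [1, 0, 0])
all_three_courses = composite_validity(r_xc, r_cc, [1, 1, 1])
print(round(first_course_only, 3), round(all_three_courses, 3))
```

With these made-up figures the test correlates more highly with the first course alone than with the three-course composite, which is the direction of the finding reported in the text.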

On the test side it is obvious that we need better and additional
measures of psychological traits, particularly in the motivation and
temperament areas. Our best predictions of later officer quality, for
example, are made from personality trait ratings obtained from peers
early in training. This does not give us a convenient flexible measure
for use in a selection program. One encouraging sign, however, is that
we are able to obtain differential validity for trait ratings by these
same peers. This finding furnishes both hints and hopes for future test
construction.

We have also been looking for additional variables in another, perhaps 
unusual, way. We have checked comparability of regressions of tests
and criteria for different bio-social groups. The typical finding is
that, when differences occur, the lines are parallel but the intercepts
differ. Females, for example, frequently have higher criterion
performance in technical training, test score for test score, than do
males. Other differences have been discovered for geographical areas. I
do not believe that we should try to adjust for such differences by
doing something to the norms. For one thing it is not apparent that
these regression differences occur only on tests on which a random
sample of females score a compensating amount lower than a random sample
of males. Neither is it necessarily true that such regression
differences are constant for a given test. It is better to view this
finding as evidence that an important variable on which males and
females differ has not been measured. Until the variable can be isolated
some improvement in prediction can be obtained by weighting sex in the
prediction equation.
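
The case of parallel regression lines with different intercepts can be made concrete with a small sketch. The data below are invented so that the two groups fall exactly on parallel lines; the function names are hypothetical. The common-slope fit used here is the least-squares solution one obtains by adding a 0/1 group term to the prediction equation.

```python
def mean(values):
    return sum(values) / len(values)


def centered_sums(x, y):
    """Return (Sxx, Sxy): sums of squares and cross-products about the means."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxx, sxy


def fit_single_line(x, y):
    """Least-squares line through all cases, ignoring group membership."""
    sxx, sxy = centered_sums(x, y)
    slope = sxy / sxx
    return slope, mean(y) - slope * mean(x)


def fit_parallel_lines(groups):
    """Common slope with a separate intercept for each group."""
    total_sxx = total_sxy = 0.0
    for x, y in groups.values():
        sxx, sxy = centered_sums(x, y)
        total_sxx += sxx
        total_sxy += sxy
    slope = total_sxy / total_sxx
    intercepts = {g: mean(y) - slope * mean(x) for g, (x, y) in groups.items()}
    return slope, intercepts


def squared_error(x, y, slope, intercept):
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))


# Hypothetical test scores (x) and technical training grades (y).
groups = {
    "male": ([40, 50, 60, 70], [50, 55, 60, 65]),
    "female": ([30, 40, 50, 60], [50, 55, 60, 65]),
}

all_x = [a for x, _ in groups.values() for a in x]
all_y = [b for _, y in groups.values() for b in y]
pooled_slope, pooled_intercept = fit_single_line(all_x, all_y)
error_without_sex = squared_error(all_x, all_y, pooled_slope, pooled_intercept)

common_slope, intercepts = fit_parallel_lines(groups)
error_with_sex = sum(
    squared_error(x, y, common_slope, intercepts[g]) for g, (x, y) in groups.items()
)
print(common_slope, intercepts, error_without_sex, error_with_sex)
```

In this invented example the single pooled line leaves appreciable error, while the equation that weights sex reproduces both groups, with the female intercept five grade points above the male intercept at every test score.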

I won't belabour further the search for additional predictors. A
survey of the field would be too time consuming. A less obvious point is
that for all kinds of tests we need either a tailor-made job or at least
the best possible fit for the group at hand. Not only must the right
abilities be measured, but the test must be of appropriate difficulty,
with enough items of that difficulty for the group on which it will be
used. Appropriateness of difficulty level is not a statistical nicety
which makes a difference of .01 or .02 in correlation coefficients. The
difference in correlations with outside criteria between using the Armed
Forces Qualifying Test, designed to cover the entire range of ability,
and a specially designed selection test for a group of officer
applicants is measured in the first decimal, not the second. Time
limitations on an all-purpose battery are encountered more severely in
terms of using sufficient items of appropriate difficulty than in terms
of including all the measurable functions necessary.

A major group of problems in prediction can be described in terms of
the need for congruence between predictive devices and methods of
combination on the one hand and criterion measures on the other.
A well discussed example is that the type of process involved, whether
additive, conjunctive, or disjunctive, must be comparable for predictors
and criteria. Most of the discussion has centered around the
applicability of the additive assumption. Actually it seems to be a
reasonably accurate model for most kinds of proficiency criteria. Tryout
of other models should be most profitable where we have signally failed
to date, not where we have been relatively successful. This is not to
say that present predictions of academic success, pilot proficiency, or
other similar criteria could not be improved through the use of more
complex equations than present additive ones. I do believe, however,
that gains will be small and difficult to establish. Many pastures are
far greener.
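
The three types of process named above can be contrasted in a few lines. The sketch below assumes one common reading of the terms: additive as a weighted sum in which strengths compensate for weaknesses, conjunctive as a minimum cutoff on every variable, and disjunctive as a cutoff on at least one variable. The applicants, weights, and cutoffs are all hypothetical.

```python
def additive(scores, weights):
    """Compensatory model: a weighted sum of the scores."""
    return sum(w * s for w, s in zip(weights, scores))


def conjunctive(scores, cutoffs):
    """Pass only if every score reaches its cutoff."""
    return all(s >= c for s, c in zip(scores, cutoffs))


def disjunctive(scores, cutoffs):
    """Pass if at least one score reaches its cutoff."""
    return any(s >= c for s, c in zip(scores, cutoffs))


# Hypothetical applicants measured on two traits.
applicants = {"A": (70, 30), "B": (50, 50), "C": (40, 40)}
weights = (0.5, 0.5)
cutoffs = (45, 45)

results = {
    name: (
        additive(scores, weights),
        conjunctive(scores, cutoffs),
        disjunctive(scores, cutoffs),
    )
    for name, scores in applicants.items()
}
for name, outcome in results.items():
    print(name, outcome)
```

Applicants A and B receive the same additive score, yet only B passes the conjunctive rule, while both pass the disjunctive one. A predictor combination built on one rule will therefore misorder cases when the criterion follows another.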



Prediction of teaching effectiveness constitutes one example eminently 
suited for the tryout of other models. Perhaps psychological analysis 
should have told us this earlier, but the piling up of negative results 
has clinched the issue. I wonder if perhaps the process involved in this
case is disjunctive. Further, I suggest that pattern analysis techniques
may be the preferred method of combining variables under this
circumstance.

A second example of congruence, or its lack, concerns two additive
techniques. Multiple regression is efficient for the prediction of
relative success in training or in jobs, but it is not efficient for the
prediction of group membership. The multiple discriminant function is an
efficient statistic for the latter problem. (It is interesting to note
that John French discarded the technique this morning because he
selected an inadequate criterion and then brought the multiple
discriminant function in again in trying to solve difficulties raised by
the use of multiple regression.) Vocational guidance counsellors are
generally more concerned with future group membership than they are with
potential proficiency. They would make fewer errors in prediction if
they could apply the appropriate statistical model. It should be noted
that this is not said in a critical spirit; admittedly it will take
several years of research and education before we can make effective use
of this development.

Two other types of lack of congruence constitute possible sources of
attenuation of correlations with criteria. To use Coombs' (1)
terminology, if we mix relative and irrelative scales (I would also use
ipsative and normative scales interchangeably with Coombs' terms) or if
we mix the tasks A and B, set for the subject in being measured, we
attenuate the correlations involving such mixed scales.

It appears to me that we have mixed relative and irrelative scales
quite indiscriminately. A forced-choice scale of vocational interests is
a good example of a relative scale, i.e., measurement is about the
subject's own mean. Most proficiency criteria, on the other hand, are
irrelative, i.e., measurement is about the mean of the group. If there
are large across-the-board differences in interest or motivational level
for academic work, we cannot expect to obtain very high correlations
between scores on a forced-choice interest test and grade point average.
By analogy we certainly wouldn't want to take across-the-board level
out of our aptitude battery in predicting this same criterion.
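
The loss incurred by removing across-the-board level can be shown with a toy calculation. The sketch assumes that a relative (ipsative) score is the raw score minus the person's own mean over the scales, and, purely for illustration, that grade point average is a linear function of the raw academic interest score. The four profiles are invented.

```python
import math


def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical interest profiles: (academic, mechanical, social).
profiles = [(9, 8, 7), (5, 3, 1), (3, 2, 1), (7, 8, 9)]

# Irrelative (normative) score: the raw academic score.
normative_academic = [p[0] for p in profiles]

# Relative (ipsative) score: academic score about the person's own mean.
ipsative_academic = [p[0] - sum(p) / len(p) for p in profiles]

# Assumed criterion: grades depend on the over-all level of academic interest.
grade_point = [1.8 + 0.2 * a for a in normative_academic]

r_normative = pearson(normative_academic, grade_point)
r_ipsative = pearson(ipsative_academic, grade_point)
print(round(r_normative, 3), round(r_ipsative, 3))
```

The first and fourth persons have the same over-all level of interest but opposite profile shapes, and the relative score keeps only the shape. Under the stated assumption the raw score predicts the criterion while the relative score does not.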

It is of interest to note that types are relative. Somato-type scores 
add to what is for all practical purposes a constant, i.e., all persons 
have the same mean. Everyone has a high score some place, no one is
low in everything, and there are no persons who are high on everything.




We might describe a perfect type for pole-vaulting, but if a given
example of the type were only five foot two he would not be able to vault
as high as many faster, taller, and stronger men who did not quite fit 
the type specifications. Correlations between the type scores and the 
proficiency criterion would not be as high as a combination of separate 
measures of height, speed, strength, weight, etc. with that criterion. 

It is my impression that clinicians tend to think in terms of types. 
Perhaps the high and low points in a person's profile are more obvious 
in the individual interview than his strengths and weaknesses relative 
to a norm group. It might also be noted that most of the empirical
comparisons of clinical and actuarial prediction have involved
proficiency criteria. I suspect that some of the astoundingly poor
results from clinical prediction result from a combination of relative
and irrelative scales.

It seems probable, as we look into this matter further, that there 
may be some important criteria that are themselves relative. If this 
were true, relative scales such as those based upon type concepts would 
predict more accurately than irrelative test scores. I wonder, for example, 
if perhaps decisions do not involve a balancing of factors within the 
person more largely than the strength of any one trait or combination 
of traits, in the normative sense. This problem can still be handled 
statistically, but we will not find the multiple regression equation which 
combines results from several irrelative scales very useful.

With respect to the task set for the subject in being measured, it is
clear that these should not be mixed and it is possible that they are
mixed, willy-nilly, in many situations in which we are trying to predict.
Task A of Coombs involves an ideal as the basic frame of reference. The
subject is free to select this ideal in many circumstances. Task B
involves evaluating a trait or component. For example, if we were to ask
a subject to rank 10 politicians in his preferred order, Task A is
involved. Presumably he starts with his ideal as rank 1, and the further
removed in any direction any politician is from the ideal the lower he
is ranked. Now if we ask the subject to rank these same men in their
order of liberalism, Task B would be involved. Note that the
relationship between the two scales resulting from these different tasks
is dependent on the position of the subject's ideal on the
liberal-conservative continuum. Over many subjects the correlation
between the two scales would probably be close to zero. Do we have here
a possible explanation for certain low correlations between tests and
criteria?
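
The dependence of the Task A and Task B relationship on the subject's ideal can be sketched with the ten-politician example. The positions on the continuum, the three ideal points, and the use of Spearman's rank correlation are all assumptions made for illustration. The ideals are placed between the politicians' positions so that no ties arise.

```python
def ranks_by(keys):
    """Give rank 1 to the smallest key; assumes there are no ties."""
    order = sorted(range(len(keys)), key=lambda i: keys[i])
    ranks = [0] * len(keys)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


def spearman_rho(ranks_a, ranks_b):
    """Spearman rank correlation for untied ranks."""
    n = len(ranks_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical positions of 10 politicians; a higher value is more liberal.
liberalism = list(range(10))

# Task B: rank the men by liberalism, rank 1 being the most liberal.
task_b = ranks_by([-position for position in liberalism])


def task_a(ideal):
    """Task A: rank by preference, rank 1 being nearest the subject's ideal."""
    return ranks_by([abs(position - ideal) for position in liberalism])


# Three hypothetical subjects: liberal, middle-of-the-road, conservative.
rho = {ideal: spearman_rho(task_a(ideal), task_b) for ideal in (9.5, 4.25, -0.5)}
print(rho)
```

The subject whose ideal lies at the liberal extreme gives identical rankings under both tasks, the subject at the conservative extreme gives exactly reversed rankings, and the middle-of-the-road subject gives rankings that are nearly unrelated. Averaged over subjects spread along the continuum, the relationship therefore tends toward zero.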

In asking this question, I am not as certain that tasks are frequently 
mixed as I am that relative and irrelative scales are frequently mixed, 
but the point is well worth investigating. Even if we ask the subject
to assume Task B, it is possible that he will nevertheless be affected
by his ideal. Do criterion ratings frequently reflect this phenomenon?
Does a consensus of raters merely reflect the average scales obtained
from Task A? Needless to say, the low correlations resulting from the
hypothetical circumstances would not reflect unfavorably on the tests.

In conclusion, for the situation in which a clinician sees a person
briefly and makes intuitive predictions of future status or behavior, I
see little hope for the improvement of clinical predictions per se.
There is a good deal of improvement possible on the other hand in
predictions that we are calling actuarial. This improvement will not
take place, however, without a good deal of research. We now have a
situation in psychology in which we probably have more tests than there
are psychologists doing related research. One of the several important
characteristics of this situation is that it allows many degrees of
freedom for the operation of chance. I would like to suggest to
clinicians that they discard 75% of their test repertoire, perhaps by
lot, that they declare a moratorium on the development of additional
tests by eager doctoral candidates looking madly for a dissertation
topic, and that they concentrate on increasing the complexity of the
nomological network, to borrow the terms used by Cronbach and Meehl (2),
concerning the tests remaining.



REFERENCES

1. Coombs, Clyde H. A theory of psychological scaling. Ann Arbor:
Engineering Research Institute, University of Michigan, 1951, 94 p.

2. Cronbach, Lee J. and Meehl, Paul E. Construct validity in
psychological tests. Psychol. Bull., 1955, 52, 281-302.

3. Meehl, Paul E. Clinical versus statistical prediction: a theoretical
analysis and a review of the evidence. Minneapolis: University of
Minnesota Press, 1954, 149 p.

4. Spence, Kenneth W. The postulates and methods of behaviorism.
Psychol. Rev., 1948, 55, 67-78.




Clinical Versus Actuarial Prediction 



PAUL E. MEEHL 



I found Dr. Zubin's empirical data very stimulating; but since they
illustrate the use of statistical method in typology and do not bear
directly on the predictive efficiency question, I shall not comment upon
them further. I am completely baffled by Dr. Zubin's main theme: that
the clinical-actuarial issue is a pseudo-problem. I do not find anywhere
in his paper a serious attempt at rigorously showing this, and it seems
to me that he has clouded the issues by bringing in the interaction
between the two methods in research work. This research interaction
has never been disputed by anyone; all agree that clinicians do generate
hunches and, on the other hand, that hunches in social science must
usually be tested by statistical methods. But the title of this symposium
is "Clinical vs. actuarial prediction," not "clinical vs. actuarial
research-planning." I still maintain that given a finite set of data
(tests or otherwise) on an individual patient, for whom a prediction is
to be made, you can either hand the data to a clerk or you can hand them
to a skilled clinician to think about. Surely this is a pragmatic
distinction of real importance. Take a simple, concrete example. We have
to decide whether a certain veteran is to be given intensive
psychotherapy or not. This is a decision-problem which is being faced in
clinics all over the country at this moment.

Does Dr. Zubin seriously assert that we cannot distinguish between
these two operations: a naive clerk filling in the values of a regression
equation, and 10 clinicians talking around a conference table? Since the
latter costs from 10 to [figure illegible] times as much (VA rates), Dr.
Zubin must have very different notions about economics from mine. Of
course, the "context of discovery" displays both methods. In my book I
emphasized Reichenbach's distinction between the two contexts not once
but several times over. In the process of constructing a mechanical
prediction system, the hunches of clinicians are usually valuable (not
always!) and sometimes indispensable. Pick the variables any way you
please, using Freudian theory, blind empiricism, or clairvoyance. You
may use either "rational" combining functions or choose empirically by
blind curve-fitting from a wide class of equations. You may study the
hits and misses intensively and qualitatively, hoping to get further
hunches as to how the combining function might be improved. At some
point, however, you move from the research process to the practical
setting; you are asked to apply the fruits of your cerebrations to a
realistic prediction problem. At that moment, what do you propose to
the clinic administrator? Do you give him a statistical table or
equation? Or do you tell him to hire a clever psychologist who will
think about the same data, case by case, and predict therefrom? The
first of these solutions is, in daily practice, what I call actuarial,
whatever its research history may be. The second solution is
non-actuarial, even if actuarial information is part of the total data
that the clinician has to "think about." Which of these two procedures
has the greater success, the larger hit-frequency, in daily
decision-making? This is no academic, hair-splitting question; it is a
practical question of intense personal significance to the suffering
patient and of great monetary importance to the taxpayer. I find in Dr.
Zubin's paper no demonstration that the distinction between clinical and
statistical prediction is spurious. Admittedly there are a few
borderline methods. But in general, any genuinely mixed method is
non-actuarial, because the defining property of the pure actuarial
method is that it is unmixed. The existence of borderline methods which
are difficult to classify does not abolish the distinction (although to
believe that it does is one of the commonest of philosophical mistakes).
We cannot say precisely how many whiskers it takes to constitute a
beard. Any cutting point, as between 78 and 79 whiskers, is arbitrary
and subliminal. But we do not conclude that there is no point in
distinguishing, or that a distinction cannot be made, between a man who
is "clean-shaven" and a man who is "fully bearded." Dr. Zubin says the
methods "complement each other." This sounds plausible and tolerant; but
what does it actually mean? In some of the published studies the effect
of allowing the clinician to adjust the actuarial prediction is a
shrinkage in predictive efficiency. That seems to me to be a clear case
not of complementation but of sabotage. It is senseless to speak of
complementation when there are two procedures both purporting to do a
specified task but one of these procedures in fact performs the task
better than the other, and even better than some mixture of the two
procedures will perform it. As to whether a really rock-bottom,
epistemological distinction can be made, this is a question of great
technical complexity. I would warn everyone against thinking it an easy
question, disposable of by a few pleasantries (such as, "the methods
complement each other"). Here is needed a thorough analysis using the
technical tools of the logicians and mathematicians. I do not know where
I stand on this one, and I have spent many hours discussing it with some
of the ablest logicians and philosophers-of-science in the business.



Dr. Zubin quotes me as saying that the actuary, like the undertaker,
has the final word; and he says he doubts this. He says that the actuary
in turn has a clinician looking over his shoulder to "see where the
formula fails." To which I must reply, so what? At this point, the
clinician thinks he "sees" where the formula fails; but Dr. Zubin knows
as well as I do that this is not the sort of thing you simply "see." We
clinicians "see" a lot of things that are not so, if the verb "to see" is
used as Dr. Zubin uses it. The context in which I make that remark
about the actuary having the "final word" makes sufficiently clear what
I mean by this. It is really no more complicated than the scientific
principle that I assume we all share, namely, it is facts that check on
theories and not the converse. That we will no doubt continue to make
still further theories is irrelevant to this primacy of facts; with respect
to a given theoretical or predictive claim, the facts do have the final
word. I can therefore only recommend to Dr. Zubin that he re-read the
passage from which he quotes, and ask him to show me specifically
where the logic is defective. Jones says that he, using method J, can
predict what will happen better than Smith using method S. If Dr.
Zubin knows of some way to resolve such a disagreement besides
keeping score on Jones and Smith, I should be fascinated to learn what it is.
And keeping score (let's be clear about it) is an incurably actuarial
process.
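
The score-keeping described here can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `hit_frequency` and the eight outcomes and predictions are hypothetical, invented for the example, and are not data from any of the studies discussed.

```python
def hit_frequency(predictions, outcomes):
    """Return the fraction of cases where the prediction matched the outcome."""
    if len(predictions) != len(outcomes) or not outcomes:
        raise ValueError("need equal-length, non-empty sequences")
    hits = sum(p == o for p, o in zip(predictions, outcomes))
    return hits / len(outcomes)


# Hypothetical dichotomous outcomes (1 = event occurred, 0 = it did not).
outcomes = [1, 0, 1, 1, 0, 0, 1, 0]

# Hypothetical predictions made in advance by Jones (method J)
# and by Smith (method S) for the same eight cases.
jones = [1, 0, 1, 0, 0, 0, 1, 1]
smith = [1, 1, 0, 1, 0, 1, 1, 0]

print("Jones:", hit_frequency(jones, outcomes))
print("Smith:", hit_frequency(smith, outcomes))
```

Whatever method produced each set of predictions, the comparison itself is a tally of hits against observed outcomes, which is the sense in which keeping score is actuarial.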

Now for Dr. McArthur. I gather he feels there is some kind of
disagreement between us, at least with respect to the significance of the
available empirical studies. It is perhaps foolish (and not in the
symposium tradition!) to say of another scholar's paper: "I agree with
everything he says." But I feel impelled to say something very like
that about Dr. McArthur. And I don't suppose we can cook up a
scientific fight if I insist upon agreeing with him. Let me here say something
of a personal nature. I am deeply convinced that in my own therapeutic
practice (which is about as psychoanalytically-oriented as one can be
without labeling himself a "wild analyst") I do things daily which the
best electronic computer cannot begin to do. If I didn't think this, I
would feel pretty guilty taking $10 an hour from my clients. I don't see
how anyone would even program a computer so as to make it use the
raw data as I use them when I interpret a client's dream. It therefore
bothers me that clinical psychologists seem to interpret my book as
anti-clinical, and pro-statistician; actually, by far the larger part of the
words in that little volume are devoted to refuting the Sarbin
viewpoint. (If you doubt that, just count pages!) At Minnesota we are
currently pre-occupied with designing experiments which are built to show
forth the clinician's unique talents. And I am pretty convinced in
advance what the outcome will be; it will be that when a clinician is
allowed (quoting Dr. McArthur) to "use the data of his choice, make
the analysis of his choice, and make the predictions of his choice," he
will look pretty good; not merely better than the actuary, but, more
importantly, capable of activities (e.g., open-ended predicting) which
the actuary does not even pretend to try. So you see how close I am to
the McArthur position. I, like him, believe that we clinikers do special,
unique, unduplicable jobs of idiographic conceptualization, when Dr.
McArthur's criteria are met by the task and its conditions. Therefore
I want us clinikers to spend our high-cost time performing these kinds
of tasks. Where do we get this time? Well, perhaps there are some other
time-consuming activities which we clinikers currently engage in that
do not meet the McArthur criteria, and in which, consequently, we are
at a disadvantage. If the McArthur criteria are applied to perhaps 90%
of the prediction tasks which are being daily attempted by working
clinicians over the country, it is clear that they are not being met.
The empirical studies I have surveyed (which now number over two
dozen) exhibit a pretty uniform trend. It appears that in prognosis,
given the predictive conditions under which practicing clinicians usually
have to operate, the clinician is largely dispensable or positively adverse
to predictive success. Dr. McArthur seems to depreciate the importance
of these empirical studies because he sees, quite rightly, that they don't
meet his criteria. This puzzles me, because I feel that they are grist for
his (and my) clinical mill. (He is wrong about Sarbin, whose clinicians
had at least an hour interview with the subjects.) These 25 studies lead
me to say, in effect, "Good! Just as I thought, when you don't meet
McArthur's criteria, the clinician is beat out by the clerk. So, let the
clerk take over these kinds of coarse prognostic and diagnostic tasks.
He does it cheaper, and he does it better. I will then occupy my third
ear (and Tompkins' souped-up Mill's Methods) with therapy and
research." Part of this research will be using both methods in a
complementary way to develop an equation for the clerk to use. The Harvard
Adult Development Study in which Dr. McArthur is engaged I classify
as research. If he should propose utilizing the method he describes in
the routine predictive tasks of working clinicians, then I will have to
start asking him my usual mundane questions about hit-frequency and
cost-accounting. Further, Dr. Zubin and I will turn over the McArthur
"clinical-introspections" to a super-statistician, just to make sure that
with this clinical help in the research context, the actuary is still unable
to cook up a mechanical method which will compete with McArthur's
clinicians. I, like Dr. McArthur, do not believe that he could; but this
is an empirical question. Don't forget: most clinicians would not have
expected the uniform trend of the 25 prognostic studies either. But in
that non-optimal domain, it seems pretty clear that the clinician's
confidence in himself is unjustified by the hard facts. The research
task for those who believe, as Dr. McArthur and I do, in the unique
clinical powers of the human brain, is to find out whether this belief
is true, and in what contexts it is true in a degree great enough to be of
practical importance.

Drs. Humphreys and Sanford cleverly sent their papers to me after
I had already dictated more than fifteen minutes of talk about Drs.
Zubin and McArthur. But there is no point anyway in rephrasing their
sound and insightful remarks, which is all that I could do. I have a
disagreement here and there but it takes too long to develop most of
these. I find myself unwilling to agree with Dr. Humphreys' view that
we cannot expect to improve those clinical predictions that are based on
brief exposure. There is evidence in the literature that people differ in
their clinical talents; if we study the process carefully as Dr. McArthur
and other researchers (such as Gage, Taft and the IPAR group) are
doing, we should be able to tease out what is involved in doing it well.
In 19[illegible] I checked up on the Multiphasic profiles of the patients I chanced
to see walking down the hall of the psychiatric unit who appeared to
me, at sight only, to be [illegible]-psychopaths. During the year I spotted
13 such; in 12 cases I was right. If it were important enough, we could
surely learn more about what I was responding to; it must be some
fairly crude aspects of dress, appearance, and manner, since I have no
psychic powers. And facts about dress, appearance, and manner, once
made explicit, are presumably teachable. Dr. Humphreys refers to
"pattern analysis" of test scores. Here is a big gap in our knowledge
that will not be filled unless you statisticians quit telling us clinicians
that Fisher or Hotelling or Rao and Slater solved this problem years
ago. They did not. There is to my knowledge no convenient, practical,
rigorous procedure for discovering the function and weighting the
variables emerging from a many-score test like the Strong, the
Multiphasic, or the Rorschach. I will here and now, in the presence of three
or four hundred potential takers, offer to name several different clinical
problems involving dichotomous criteria in which a Minnesota-trained
eye can sort out Multiphasic profiles better than any of these methods.
We are currently studying one such Multiphasic task, namely, the
discrimination of psychosis from neurosis. I expect the discriminant
function to excel the fledgling cliniker, but I expect the skilled cliniker
to do still better. Better than all three (and a preliminary study shows
this) will be an objective set of complex-pattern rules developed by
Dr. Grant Dahlstrom and me. Why am I so confident, a priori, of this
order? Because the student clinician follows a near-linear and
unconfigured function, with non-optimal weights and low diurnal reliability for
identical profiles. The discriminant function eliminates the unreliability
and non-optimal weights. The skilled cliniker employs a configural
function, and in the case of MMPI this is so important that the
superimposed errors of non-optimal weights and unreliability do not wash
out the configural gain. Finally, the objective pattern-criteria are
configural and the decision is consistent from case to case. Non-optimal
weights remain with us. With a 9-variable system, and no underlying
theory to suggest a rational combining function, you would have
9 + 9 + 36 = 54 parameters to fit, if you went past the linear discriminant
function to a second-degree expression (with the all-important
cross-products). Think, dear brethren, of the sampling errors you would be
packing into those 54 constants!
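
The parameter count for a second-degree expression can be checked with a short sketch. It assumes the count described in the passage (one linear weight per variable, one squared term per variable, and one cross-product per pair of variables, with no constant term); the function name `second_degree_parameter_count` is hypothetical.

```python
from math import comb


def second_degree_parameter_count(n_variables):
    """Count the weights in a second-degree function of n_variables scores.

    Counts linear terms, squared terms, and pairwise cross-products.
    A constant (intercept) term is deliberately not counted.
    """
    linear_terms = n_variables
    squared_terms = n_variables
    cross_products = comb(n_variables, 2)  # number of unordered pairs
    return linear_terms + squared_terms + cross_products


# A 9-variable system: 9 linear + 9 squared + 36 cross-product weights.
print(second_degree_parameter_count(9))
```

The cross-product term grows with the square of the number of scales, so the number of constants to be estimated from a sample rises much faster than the number of variables.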

I think Dr. Sanford is right in suggesting that statisticians and
clinicians are really interested in predicting different kinds of things.
But I want to force this out into the open, because I insist that many
working clinicians are blissfully misusing the clinical method to predict
the actuary's kind of thing. One program that I am sure all five of us
can agree to, and recommend to you as both stimulating and socially
significant research, is the empirical study of the two methods of
prediction under the various conditions set forth by the four speakers.
For what kind of criterion, given what kinds of data, with how much
exposure, in what sequence, and so on and on, can the clinician (what
clinician?) excel the actuary? There is room for many more studies
trying various combinations of conditions before we have the answer.
And I should say "answers"; because it will hardly be a decision as to
who wins. Rather we will have trustworthy information as to which
predictive problem is best handled by which method. Here I would
like to go into the tremendous matter of form versus content, which I
now tend to see as the real nub of the business. But that would take
all night, so it will have to wait for another time.




Appendix 

Participants: 1955 Invitational Conference on Testing Problems



Adkins, Dorothy C., University of North Carolina
Adlerstein, Arthur, Princeton University
Alexander, Irving E., Princeton University
Allen, Kathryn M., Schenectady Public Schools
Allen, Margaret K., Public Schools, Portland, Maine
Allison, Roger B., Jr., Educational Testing Service
Alman, John E., Boston University
Anastasi, Anne, Fordham University
Anderson, Ficlwttrii K., Educational Testing Service
Anderson, Gordon V., University of Texas
Anderson, Pauline K., New York State Employment Service
Anderson, Roy N., North Carolina State College
Anderson, T. W., Columbia University
Andree, Robert G., Brookline High School, Massachusetts
Angell, George W., Jr., Educational Testing Service
Angoff, William H., Educational Testing Service
Ansbacher, H. L., University of Vermont
Armstrong, Fred, Lehigh University
Aronow, Miriam S., New York City Board of Education
Arsenian, Seth, Springfield College, Massachusetts
Bannon, Charles J., Crosby High School, Waterbury, Connecticut
Bardack, Herbert D., New York State Department of Civil Service
Bargmann, Rolf E., University of North Carolina
Barnes, Paul J., World Book Company
Barmc, Marguerite F., Vocational Advisory Service, New York City
Barrett, Dorothy M., Hunter College
Bartelme, Phyllis F., New York Regional Respirator and Rehabilitation Center
Bartnik, Robert V., Educational Testing Service
Bauernfeind, Robert H., Science Research Associates
Beck, Hubert Park, City College of New York
Bedard, Joseph A., Public Schools, New Britain, Connecticut
Belt, Sidney L., Educational Testing Service
Bement, Dorothy M., Northampton School for Girls
Benda, Harold W., New Jersey Department of Education
Bennett, George K., The Psychological Corporation
Bennett, Ralph, New York City
Benson, Arthur L., Educational Testing Service
Berdie, Ralph F., University of Minnesota
Bergesen, B. E., Personnel Press, Inc.
Berne, Ellis, New York State Rent Commission
Berrien, F. K., George Washington University
Blackwell, Sara, Educational Testing Service
Blaul, R. Elizabeth, Highland Park High School, Illinois
Bloom, B. S., University of Chicago
B4)AMi, Veronica M., Department of Personnel, New York City
Boger, Jack Holt, Richmond Public School
Boldt, R. F., Educational Testing Service
Bollenbacher, Joan, Cincinnati Public Schools
Bookbinder, Murray, Personnel Department, Philadelphia
Borgatta, Edgar F., Russell Sage Foundation
Bowker, Albert H., Stanford University
Braca, Susan K., Archdiocesan Vocational Service
Brandt, Hyman, American Occupational Therapy Association
Bray, Douglas W., Columbia University
Bretnall, Doris, Educational Records Bureau
Bridgman, Donald S., American Telephone and Telegraph Co.
Bristow, William H., Bureau of Curriculum Research
Brobst, Harry K., Oklahoma A & M College
Baoor.iih.n, J. Lawrence, YMCA Vocational Service Center
Brolyer, Cecil, New York State Department of Civil Service
Brooks, Richard B., College of William and Mary
Brown, Frederick S., Great Neck Public Schools
Bryan, Miriam M., The Psychological Corporation
Bryan, Ned, Rutgers University
Buckingham, Guy K., Allegheny College
Buel, William D., Temple University
Burdock, E. I., Carnegie Corporation of New York
Burke, James M., Darien Public Schools, Connecticut
Burke, Paul J., Bell Telephone Laboratories
Burnham, Paul S., Yale University
Buros, Oscar K., Rutgers University
hvHNR. Uma a., Temple University
Campbell, Donald W., Newark Public Schools
Capps, Marian P., South Carolina State College
Carlson, Harold S., Upsala College
Carlson, J. Spencer, University of Oregon
Carroll, John B., Harvard University
Carstater, Eugene D., Bureau of Naval Personnel
Cayne, Bernard S., Ginn and Company
Chacko, George, Educational Testing Service
Chappell, Bartlett E. S., New York Military Academy
Chauncey, Henry, Educational Testing Service
Christopherson, Helen, Arthur C. Croft Publications
Churchill, Ruth, Antioch College
Cliff, Norman, Educational Testing Service
Cobb, William E., Pennsylvania State University
Cocklin, John H., Temple University
Coffman, William E., Educational Testing Service
(!:ouAN, Blanche, Educational Testing Service
Cohen, Philip S., Montclair State Teachers College
Cole, Joseph W., University of Rochester
Coleman, William, University of Tennessee
Cooper, Hermann, State University of New York
Cox, Henry M., University of Nebraska
Crane, Percy F., University of Maine
Craven, Ethel Case, Polytechnic Institute of Brooklyn
Crawford, Barbara, Educational Testing Service
Crissy, W. J. E., Personnel Development, Inc.
Criswell, Joan H., Office of Naval Research
Cummings, Mary B., Boston Public Schools
Cureton, Edward E., University of Tennessee
Cureton, Louise W., Knoxville, Tennessee
Cutts, Norma E., New Haven State Teachers College
Cynamon, Manuel, Brooklyn College
Dailey, John T., Bureau of Naval Personnel
Daly, Alice T., New York State Department of Education
Damrin, Dora E., Educational Testing Service
Davidoff, M. D., U. S. Civil Service Commission
Davidson, Helen M., City College of New York
Davis, Fred, Hunter College
Davison, Hugh M., Pennsylvania State University
Day, Hubert S., U. S. Military Academy
Dean, K. D. M., Educational Testing Service
Decker, Frederick, Educational Testing Service
Detchen, Lily, Pennsylvania College for Women
Diamond, M. David, Riverside Hospital, New York
Diamond, Lorraine K., Teachers College, Columbia University
Dickson, Gwen Schneidler, Silver Spring, Maryland
Diederich, Paul B., Educational Testing Service
Diers, Helen A., Vocational Advisory Service
Dingilian, David H., Los Angeles City
Dion, Robert, California Test Bureau
Dobbin, John E., Educational Testing Service
Dodds, Alice, Educational Testing Service
Doppelt, Jerome E., The Psychological Corporation
Dragositz, Anna, Educational Testing Service
Drake, L. E., University of Wisconsin
Dressel, Paul L., Michigan State University
Di KRn, '^imi, Brooklyn College
Dunn, Frances K., Brown University
Dunn, Joseph F., Prudential Insurance Company of America
Durost, Walter N., Test Service and Advisement Center
Dubno, Peter, Polytechnic Institute of Brooklyn
Dutton, Eugene, University of Illinois
Dyer, Henry S., Educational Testing Service
Eads, Laura K., New York City Board of Education
Ebel, Robert L., State University of Iowa
Eckert, Ruth E., University of Minnesota
Edelstein, J. David, Educational Testing Service
Edelstein, Ruth B., Bureau of Child Guidance, New York City
Edrington, T., Department of Defense
Engelhart, Max D., Chicago Public Schools
Epstein, Bertram, City College of New York
Epstein, Sidney, National Research Council
Estavan, Donald, Educational Testing Service
Evenson, A. B., Department of Education, Alberta, Canada
Fan, C. T., Educational Testing Service
Farabaugh, Mary F., Department of the Navy
Farr, George C., International Business Machines Corporation
Fay, Paul J., New York State Department of Civil Service
Feinberg, M. R., City College of New York
Feldt, Leonard S., State University of Iowa
Fendrick, Paul, Western Electric Company
Fenollosa, George M., Houghton Mifflin Company
Fenstermacher, Guy M., Educational Testing Service
Ferguson, George A., McGill University
Ferris, F. L., Jr., Educational Testing Service
Fifer, Gordon, Test Research Service, Inc.
Findley, Warren G., Educational Testing Service
Fink, August A., Jr., Columbia University
Finkle, Robert B., Metropolitan Life Insurance Company
Fischer, Clyde L., Department of Education, Puerto Rico
Flanagan, John C., American Institute for Research
Fletcher, Frank M., Jr., Ohio State University
Flemming, Edwin G., Burton Bigelow Organization
FijJTCMER, Carol Ann, Edward N. Hay & Associates, Inc.
Forlano, George, New York City Board of Education
Forrester, Gertrude, West Side High School, Newark
Fox, William H., Indiana University
Freas, Howard J., Jr., Educational Testing Service
Frederiksen, Norman, Educational Testing Service
Freeman, Paul, Educational Testing Service
French, Benjamin, New York State Department of Civil Service
French, John W., Educational Testing Service
Friedenberg, Edgar, Brooklyn College
Friedman, Sidney, Bureau of Naval Personnel
Frutchey, Fred P., U. S. Department of Agriculture
Fulton, Renée J., Bureau of Curriculum Research
Furst, Edward J., University of Michigan
Gallagher, Henrietta L., Educational Testing Service
Gardner, Eric F., Syracuse University
Gaver, Frances, Educational Testing Service
Gelink, Marjorie, The Psychological Corporation
Gerberich, J. Raymond, University of Connecticut
Giangrande, Salvatore C., Cliffside Park Junior High School, New Jersey
Giddings, Frank, Springfield Trade School, Massachusetts
GiKK, Helen M., College Entrance Examination Board
Glass, Albert A., The Signal School, Fort Monmouth
Goddard, W. A., International Business Machines Corporation
Godshalk, Fred I., Educational Testing Service
Goldstein, Leo S., Teachers College, Columbia University
Goodman, Samuel M., Puerto Rican Study
Gordon, Mary Alice N., Macy's New York
Graham, Elaine, Bank Street College of Education
Greene, Edward B., Chrysler Corporation
Greene, Paul C., University of Illinois
Gross, Cecily, City College of New York
iJRiJDEi., Regina, Teachers College, Columbia University
Guerriero, Michael A., City College of New York
Gulliksen, Harold, Educational Testing Service
Gustad, John W., University of Maryland
Haagen, C. Hess, Wesleyan University
Hagen, Elizabeth, Teachers College, Columbia University
Haggerty, Helen, Personnel Research Branch, Department of the Army
Hagman, Elmer K., Greenwich Public Schools, Connecticut
Hall, Robert G., Manter Hall School, Cambridge, Massachusetts
Halpern, Joseph B., Personnel Department, Stamford, Connecticut
Harmon, Lindsey R., National Research Council
Harper, Bertha P., Personnel Research Branch, Department of the Army
Harter, Roger, American Telephone and Telegraph Company
Hastings, J. Thomas, University of Illinois
Hayes, Rosemary, Educational Testing Service
Healy, Ernest A., Center for Psychological Service, Washington, D. C.
Heath, S. Roy, Jr., Knox College
Heaton, Kenneth L., Richardson, Bellows, Henry and Company
Heil, Louis M., Brooklyn College
Heinemann, Richard F. D., Stewart, Dougall and Associates
Heiser, Ruth Bishop, Goshen, Kentucky
Helmick, John, Educational Testing Service
Helm, Carl, Educational Testing Service
Helmstadter, Gerald C., Educational Testing Service
Hemphill, John K., Educational Testing Service
Herrick, C. James, Rhode Island College of Education
Hieronymus, A. N., State University of Iowa
Hill, Walker H., Michigan State University
Hills, John R., Educational Testing Service
Hirsch, Richard, Educational Testing Service
Hittinger, William F., Haller, Raymond and Brown
Holland, John, Veterans Administration Hospital, Perry Point, Maryland
Hollis, William H., New York City
Hollister, John S., Educational Testing Service
Holley, Clifford S., Personnel Department, Philadelphia
Horton, Clark W., Dartmouth College
Horowitz, Leola S., Queens College
Horowitz, Milton W., Queens College
Howie, Duncan, University of New England, Australia
Hubbard, John P., National Board of Medical Examiners
Huddleston, Edith M., Educational Testing Service
Hughes, J. L., International Business Machines Corporation
Humphreys, Lloyd G., Personnel Research Laboratory, Lackland Air Force Base
Hunt, Thelma, George Washington University
Hunter, Genevieve T., Archdiocesan Vocational Service, New York
Jaspen, Nathan, National League for Nursing, New York
Jeffrey, Thomas E., University of North Carolina
Johnson, A. Pemberton, Newark College of Engineering
Johnson, M. C., Educational Testing Service
Johnson, Theron A., New York State Department of Education
Jordan, Arthur M., University of North Carolina
Kaback, Goldie R., City College of New York
Kahn, Robert, Educational Testing Service
Kalubach, H. Lynn, Columbia Public Schools, South Carolina
Kaplan, Bernard A., New York State Department of Education
Kelton, John D., University of North Carolina
Kendrick, S. A., College Entrance Examination Board
Reppicb, Charlotte, Standard Oil Company (New Jersey)
Keith, A. H., Putnam, Connecticut
Kern, D. W., University of Bridgeport, Connecticut
Kernan, John P., Vick Chemical Company
Kidd, John W., Northwestern State College, Louisiana
Kimball, Elizabeth, Educational Testing Service
Kipnis, David, American Cancer Society
Kleidman, Ruben, Brooklyn College
Kline, William E., The Choate School, Wallingford, Connecticut
Kling, Frederick R., Educational Testing Service
Kogan, Leonard S., Community Service Society, New York
Kogan, Nathan, Harvard University
Kosmerl, Alice, Washington, D. C.
Krathwohl, David R., University of Illinois
Kubis, Joseph F., Fordham University
Kushner, Rose K., City College of New York
Kvaraceus, William C., Boston University
Lambert, Joan, Educational Testing Service
Lamke, T. A., Iowa State Teachers College
Langmuir, C. R., The Psychological Corporation
Lannholm, G. V., Educational Testing Service
Layton, Wilbur L., University of Minnesota
L(>i;(;iiKRY, Gertrude M., Church Street School, Hamden, Connecticut
Lennon, Roger T., World Book Company
Leverett, Hollis M., American Optical Company
Levine, Richard, Educational Testing Service
Lichtenstein, Ralph, Department of Personnel, New York City
Lindberg, Lucile, Queens College
Lindquist, E. F., State University of Iowa
Litterick, William S., The Harley School, Rochester, New York
Lohman, Maurice A., University of the State of New York
Long, Louis, City College of New York
Lord, Frederic, Educational Testing Service
Lord, Shirley H., Educational Testing Service
Lorge, Irving, Teachers College, Columbia University
Luckey, Bertha M., Cleveland Public Schools
Lusk, Louis T., Norwalk, Connecticut
Lutz, Orpha M., State Teachers College, Montclair, New Jersey
Lynaugh, M. H., Western Electric Company
Lyons, William A., New York State Department of Education
Macri, Vincent S., Alfred Politz Research, Inc., New York
MacKay, James L., South San Antonio Public Schools
MacPhail, Andrew H., Brown University
Magoon, Thomas M., University of Maryland
Maltby, Jane M., Board of Education, Hamden, Connecticut
Mandell, Milton M., U. S. Civil Service Commission
Manuel, Herschel T., University of Texas
Marquis, Lloyd K., Arthur C. Croft Publications
Marsh, Mary M., Educational Testing Service
Marston, Helen M., Educational Testing Service
Martin, Harold F., International Business Machines Corporation
Mathewson, Robert H., Division of Teacher Education, New York City
Maxson, Georgia, Educational Testing Service
McArthur, Charles C., Harvard University
McCabe, Frank J., Metropolitan Life Insurance Company
McCall, W. C., University of South Carolina
McCambridge, Barbara, Educational Testing Service
McCann, Forbes E., Personnel Department, Philadelphia
McCully, C. Harold, Veterans Administration
McGillicuddy, Marjorie, New York State Department of Civil Service
McIntire, Paul H., University of New Hampshire
McQuitty, John V., University of Florida
Medley, Donald M., Municipal Colleges of New York
Meehl, Paul E., University of Minnesota
Mellinger, J. J., University of North Carolina
Melville, S. D., Educational Testing Service
Mehnyk, Charlotte I^vy, Chunky Choc- 

Meiiwin. Jack C, Syracuse University 
Metz, Klliott, New School for Social 
Kem^arch 

MicfiAEu SU'pheii U., Educational Test- 
ing Servicf* 

MicfiAEi.. William H., University of 
Sfiutheni California 

MicHKi.i. (imie. Metropolitan iJfe Iii- 
Hurance (Company 

Mii.u Cyril H., nichmond Public Scht)«il.s 

Mii.LKR. Howard C;., Carnegie Institute 
of Technology 

M1LI.KTT, Father, WejiUiver Strhool, 
Middleliury, Omnecticut 

MiTCiiKi.i.. HI y the C, World Book Com- 
pany 

MiTZKi.. Ilurold K.. Division of Teacht-r 

F^iuration, New York City 
Moix, CUrent^e, Penn Military C<illege 
Moi.LKNKoPF, William <i, Edurntional 

I'esting Service 
MoRCAN, Donna D.. New York City 
MoRCAN, Henry H.. The Psychological 

Corp<iration 
M OR RIM. Nancy. FMucatioiiul Testing 

.Service 

MoRRiMfiN. J. Cayce. Puerto Wwmt Study 
Morton. AnUm, FMncational Testing 
Service 

Momki.y. Uus!m*II. Wiscon.sin Statv I)e- 
partnirnt of Public Instruction 

MtrRRAY. John K.. Special Device** 
Center. ONM 

Mykrm. Charlt's T . FJncational Testing 
Service 

Myerm. hoU-rt I... Temple University 
Myerm. Shehloii S.. F^incational Testing 
Serviet* 

Nki>m)N. Kenneth (i.. New York State 

l)i*(mrtnii*nt of Ivincatioii 
Nkmn. Margurt't. lulucatioiial Tenting 

St V iff 

Nkwm\n. Sidnp> II., Department »if 
Health. F^urution, and Welfare 

Nii.u Kathryn Usher. Silver Rurdrtt 
Company 

NoiJ^ VicUir H., Michigan Stat« Uni- 
^♦•rsity 



North, Robert D., Educational Records 
Bureau 

Nosow, Sigmund, Michigan State Uni- 
versity 

Olhen, Marjorie, Educational Testing 
Service 

ORLEANi), Beatrice S., Bureau of Ships, 

Navy Department 
Orlkans, Joseph B., (Ie<irge Washiiigt^jn 

High School, New York City 
Orr, David B., Teachers College, Co- 

lumbia University 
OzKAPTAN, Halim, Educational Research 

Corporation 
Pace, C. Robert, Syracuse University 
Palmer. Harold I., East Orange High 

School 

Paumeii, Orville, Educational Testing 

S.Tvice 

Patton. James B., Jr., Virginia SUte 

Department of Ekiucation 
Pearson, Richard, Educational Testing 

Service 

I^RUf AN, Mildred, Department of Per- 
sonnel, New York City 

PerijOFP, Robert, Science Research As- 
sociates 

Perry, W. D., University of North 
Carolina 

pETERSorf, Donald A., Life Insurance 
Agency Management Association 

P111LIJP8. I^ura M.. Silver Burdett Com- 
pany 

PiKBsoN. (JiHirge A., Queens Gillege 
PiNZKA, Charles F., Educational TesUng 
Service 

PiTCiiKR, Barbara, Educational Testing 
SiTvice 

Plumleb, Lynnettc B,, Eulucationnl 

Testing Service 
PoLiJiCK, Norman C. New York StaU* 

Department of Civil Service 
Pratt. Carroll C. IVinceton University 
QuiNN. Edward R.. University of Notre 

Dame 

Raoasch, John, California Test Bureau
Raine, Walter J., Educational Testing
Service

Rapparlie, John H., Owens-Illinois
Glass Company






Raiikin, Judith G., University of
Massachusetts

Reed, Anna K., New York State
Department of Civil Service

Regan, James J., Special Devices Center,
ONR

Reid, John W., Veterans Administration
Remmers, H. H., Purdue University
Reppert, Harold C., Temple University
Ricciuti, Henry N., University of
Colorado Medical School
Ricks, J. H., Jr., The Psychological
Corporation
Riessman, Frank, Bard College
Rimalover, Jack K., Educational Testing
Service
Rivlin, Harry N., Queens College
Robbins, Irving, Queens College
HoNsBAUGBN, Raydon P., Kent School,
Connecticut
Rosenzweig, Allana, Teachers College,
Columbia University
Rosinski, Edwin F., University of
Buffalo

Rosner, Benjamin, Teachers College,
Columbia University
Rulon, P. J., Harvard University
Sait, Edward, Rensselaer Polytechnic
Institute

Sands, Elizabeth, Standard Oil Company
(New Jersey)
Sanford, Nevitt, Vassar College
Saunders, D. R., Educational Testing
Service

Sawin, Enoch I., Air Force ROTC
Headquarters, Montgomery
Scates, Alice Y., U. S. Office of
Education

Schapiro, Harold B., Kmmland &
Company

Scheider, Rose W., Educational Testing
Service

Schrader, W. B., Educational Testing
Service

ScHRoRDEU C. International Business
Machines Corporation

Schutz, Richard E., World Book
Company

Scott, C. Winfield, Vocational
Counseling Service, Inc.




Seashore, Harold, The Psychological

Corporation 
Seibel, Dean W., Educational Testing 

Service 

Sforza, Richard F., New York State

Department of Civil Service 
Sharp, Catherine, Educational Testing 

Service 

Shaycoft, Marion F., American
Institute for Research

Shimberg, Benjamin, Educational Testing
Service

Shover, Bertram P., Grosse Pointe Uni- 
versity School, Michigan 

Silberman, Harry F., City College of
New York 

Sitgreaves, Rosedith, Teachers College,
Columbia University 

Slaughter, Robert E., McGraw-Hill
Book Company

Smith, Alexander F., New Haven State
Teachers College 

Smith, Ann Z., Educational Testing 
Service 

Smith, Denzel D., Office of Naval Re- 
search 

Snodgrass, Robert, Educational Test- 
ing Service 

Snyder, Betty, Educational Testing 
Service 

Solomon, Herbert, Teachers College,
Columbia University
Solomon, Robert, Educational Testing
Service

Souther, Mary Tayloe, Tower Hill 

School, Wilmington 
Spaulding, Geraldine, Educational
Records Bureau
Spaney, Emma, Queens College
Speer, George S., Illinois Institute of
Technology
Spearritt, Donald, Harvard University
Stake, Robert Earl, Princeton
University

Stalnaker, John M., National Merit 

Scholarship Corporation 
Stalnaker, Mrs. John M., National 

Merit Scholarship Corporation 
Stecklein, John E., University of 

Minnesota 






Stevens, William C., Veterans
Administration Hospital, Perry Point,
Maryland

Stewart, Mary, Institute of Physical 
Medicine and Rehabilitation 

Stewart, Naomi, Educational Testing 
Service 

Stice, Glen, Educational Testing Service
Stodola, Quentin, Educational Testing 
Service 

Stokes, Thomas M., Metropolitan Life
Insurance Company
Stone, Paul T., Huntingdon College
Stoughton, Robert W., Connecticut
State Department of Education
Stovall, F. L., University of Houston
Stuart, William A., Educational Testing
Service
Stulbaum, Harold, Metropolitan Life
Insurance Company
Super, Donald E., Teachers College,
Columbia University
Swanson, Edward O., University of
Minnesota
Swineford, Frances, Educational Testing
Service
Symonds, Percival M., Teachers College,
Columbia University
Tatsuoka, Maurice, Harvard University
Taylor, Justine, Educational Testing
Service

Terral, J. E., Educational Testing
Service

Thompson, Albert S., Teachers College,
Columbia University 

Thompson, Kathleen, Educational
Research Corporation

Thorndike, Robert L., Teachers College,
Columbia University

Thurstone, Thelma G., University of
North Carolina

Tiedeman, David V., Harvard
University

Tin K I K, J. W., Mitchel Air Force Base
Trail, Stanley M., University of
Connecticut

Traxler, Arthur E., Educational Records
Bureau
Triggs, Frances, Committee on
Diagnostic Reading Tests, Inc.



Troyer, Maurice E., International
Christian University, Tokyo

Tucker, Ledyard R, Educational Test- 
ing Service 

Turnbull, William W., Educational
Testing Service 

Twyford, Loran C., Special Devices
Center, ONR 

Upshall, Charles C., Eastman Kodak
Company 

Valley, John, Educational Testing
Service

Van Cleve, William J., Educational

Testing Service 
Vickery, Verna, Southeastern Louisiana
College

Viteles, Morris S., University of
Pennsylvania

Vose, John C., Educational Testing
Service

Wadell, Blandena C., World Book
Company 

Wagner, E. Paul, Teachers College,
Bloomsburg, Pennsylvania
Walker, Helen M., Teachers College, 

Columbia University 
Walsh, B. Thomas, Personnel Depart- 
ment, Philadelphia 
Walter, Charles, City of Philadelphia 
Walton, Wesley W., Educational Test- 
ing Service 
Wantman, M. J., University of Rochester 
Watkins, Richard W., Educational Test- 
ing Service 
Watson, Walter S., The Cooper Union 
Webster, Harold, Vassar College 
Weiss, Eleanor S., Educational Testing 
Service 

Weiss, Joseph, Polytechnic Institute of
Brooklyn

Weitz, Henry, Duke University

Wesman, Alexander G., The
Psychological Corporation

Whitla, Dean K., Harvard University

Whitney, Alfred C., Life Insurance
Agency Management Association

Wilcox, Glenn W., Boston University
Junior College

Wilke, Marguerite M., Board of
Education, Greenwich, Connecticut






Wilke, Walter H., New York University
Wilks, S. S., Princeton University
Willard, Richard W., Harvard
University

Williams, Robert J., Columbia
University

Williams, Roger V., Morgan State
College, Baltimore

Wilson, John T., National Science
Foundation

Wilson, Kenneth M., Princeton
University

Wilson, Phyllis C., Queens College
Winans, S. David, New Jersey State
Department of Education
Wingo, Alfred L., Virginia State Board
of Education
Winterbottom, J. A., Educational
Testing Service



Wolf, Beverly, City College of New
York

Wolman, Benjamin, City College of
New York
Womer, Frank B., Houghton Mifflin
Company

Wood, Ray G., Ohio State Department

of Education 
Wright, Wilbur H., Geneseo State 

Teachers College 
Wrightstone, J. Wayne, Bureau of

Educational Research 
Zalkind, Sheldon S., City College of

New York 
Zimiles, Herbert, Bank Street College

of Education, New York 
Zubin, Joseph, Columbia University



